for each potential $A \in \mathcal{X}$ a transfer operator $\mathscr{L}_A$ acting on $\mathcal{X}(\Omega)$. Under
suitable hypotheses, it is well-known that $\mathscr{L}_A$ has a maximal eigenvalue $\lambda_A$,
has a spectral gap and defines a unique Gibbs measure $\mu_A$. Moreover there is
a unique normalized potential of the form $B = A + f - f \circ T + c$ acting as a
representative of the class of all potentials defining the same Gibbs measure.

The goal of the present article is to study the geometry of the set of 
normalized potentials $\mathcal{N}$, of the normalization map $A \mapsto B$, and of the
Gibbs map $A \mapsto \mu_A$. We give an easy proof of the fact that $\mathcal{N}$ is an analytic
submanifold of $\mathcal{X}$ and that the normalization map is analytic; we compute
the derivative of the Gibbs map; last we endow $\mathcal{N}$ with a natural weak
Riemannian metric (derived from the asymptotic variance) with respect to 
which we compute the gradient flow induced by the pressure with respect to 
a given potential, e.g. the metric entropy functional. We also apply these 
ideas to recover in a wide setting existence and uniqueness of equilibrium 
states, possibly under constraints. 


Supported by Brazilian-French Network in Mathematics.

Supported by the Agence Nationale de la Recherche, grant ANR-11-JS01-0011.

Supported by CNPq-Brazil and INCTMat.

Partially supported by CNPq-Brazil through the postdoctoral scholarship PDJ/501839/2013-5.
2010 Mathematics Subject Classification: 37D35, 37A35, 37C30, 49Q20 
Keywords: Transfer operators, Equilibrium states, Entropy, Regularity, Wasserstein space





1 Introduction 


The goal of this article is to propose a differential-geometric approach to the
thermodynamical formalism for maps whose transfer operators satisfy the conclusion of the
Ruelle-Perron-Frobenius theorem (for example, expanding maps).

Some of our results stated below are already known for certain dynamical systems (see 
later for more precise historical references); let us stress what we believe are our main 
contributions:

• we propose a point of view, based on differential geometry in the space of
potentials, which provides new and efficient^1 proofs of strong results (e.g. Fréchet
derivatives are computed instead of mere directional derivatives) valid in a fairly
general framework,

• we compute an explicit formula for the derivative of $\int \varphi \, d\mu_A$ with respect to $A$
(Theorem C), leading naturally to the variance metric linking together several
natural quantities (Theorem D),

• we use this metric to define the gradient of natural functionals, which leads to a 
gradient flow modeling a system out of equilibrium (Section 7.3.2), 

• we show that the map sending a potential to its Gibbs measure is very far from 
being smooth in the sense of optimal transportation (Theorem E), 

• we improve a result of Kucherenko and Wolf, identifying precisely the equilibrium
state of a potential under a finite set of linear constraints (Theorem G, see also
Example H).

1.1 Transfer operator, Gibbs measures and normalization 

Let $\Omega$ be a compact metric space and $T : \Omega \to \Omega$ be a finite-to-one map, defining a
discrete-time dynamical system. The model cases are uniformly expanding maps such
as $x \mapsto dx \bmod 1$ on the circle or the shift over right-infinite words on a finite alphabet,
but we shall consider a very general setting by mostly asking^2 that for each potential
$A$ in a suitable function space $\mathcal{X}(\Omega)$, the Ruelle-Perron-Frobenius theorem holds for the
transfer operator

$$\mathscr{L}_A : \mathcal{X}(\Omega) \to \mathcal{X}(\Omega), \qquad f \mapsto \Big( x \mapsto \sum_{T(y)=x} e^{A(y)} f(y) \Big),$$

i.e. $\mathscr{L}_A$ has a positive, simple leading eigenvalue $\lambda_A$ associated with a positive
eigenfunction $h_A$; its dual operator $\mathscr{L}_A^*$ acting on measures has a unique eigenprobability
$\nu_A$; and $\mathscr{L}_A$ has a spectral gap below $\lambda_A$. Then, the measure $\mu_A = h_A \nu_A$ (where the
multiplicative constant in $h_A$ is chosen so as to make $\mu_A$ a probability measure) is an
invariant measure for $T$, which we will call here the Gibbs measure associated to the
potential $A$.

^1 See for example Corollary 3.6.

^2 See Section 2 for the precise hypotheses.

Two different potentials $A$, $B$ which differ by a constant and a coboundary define the
same Gibbs measure; Gibbs measures can thus be parametrized by the quotient space
$\mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$ where

$$\mathcal{C} = \{ c + g - g \circ T \mid c \in \mathbb{R},\ g \in \mathcal{X}(\Omega) \}.$$

The subset $\mathcal{N} \subset \mathcal{X}(\Omega)$ of normalized potentials (i.e. such that $\lambda_A = 1$ and $h_A = 1$)
contains exactly one representative of each class modulo $\mathcal{C}$, making $\mathcal{N}$ another natural
parameter space for Gibbs measures.
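To make these objects concrete, here is a minimal Python sketch under assumptions that are ours and not the paper's: $T$ is the shift on $\{0,1\}^{\mathbb{N}}$ and the potential depends only on the first two coordinates, stored in a hypothetical table `A[a][j]`. Restricted to functions of the first coordinate, the transfer operator is then a positive 2x2 matrix, and its leading eigendata can be written in closed form. The function name `leading_eigendata` is illustrative only.

```python
import math

def leading_eigendata(A):
    """Leading eigendata of the transfer operator of a two-coordinate potential.

    Assumption (not from the text): T is the shift on {0,1}^N and A[a][j] is the
    value of the potential on sequences starting with the symbols a, j. On
    functions of the first coordinate, (L f)(j) = sum_a exp(A[a][j]) * f(a).
    """
    M = [[math.exp(A[a][j]) for a in range(2)] for j in range(2)]
    p, q, r, s = M[0][0], M[0][1], M[1][0], M[1][1]
    lam = (p + s + math.sqrt((p - s) ** 2 + 4 * q * r)) / 2  # leading eigenvalue
    h = [q, lam - p]    # positive right eigenvector: M h = lam h
    nu = [r, lam - p]   # positive left eigenvector: nu M = lam nu
    total = nu[0] + nu[1]
    nu = [v / total for v in nu]            # eigenprobability of the dual operator
    z = h[0] * nu[0] + h[1] * nu[1]
    h = [v / z for v in h]                  # rescale h so that h * nu has mass 1
    mu = [h[i] * nu[i] for i in range(2)]   # law of the first coordinate under the Gibbs measure
    return lam, h, nu, mu

lam, h, nu, mu = leading_eigendata([[0.3, -0.2], [0.1, 0.5]])
print(lam, mu)
```

For the zero potential this returns $\lambda_A = 2$ and the uniform law on the first coordinate, as expected for the measure of maximal entropy of the full 2-shift.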

Our main object of study is the first-order variation of $\mu_A$ with respect to $A$, which
means we consider $\mu_{A+\zeta}$ for small $\zeta \in \mathcal{X}(\Omega)$; of course, adding to $\zeta$ a constant and a
coboundary will have no effect. In the literature, it is often the case that one asks $\zeta$
to satisfy the normalizing condition $\int \zeta \, d\mu_A = 0$ to get rid of the constant, and then
considers $\zeta$ up to coboundaries. We argue that instead, it makes things simpler and
clearer to go fully with one of these points of view: either consider both $A$ and $\zeta$ modulo
$\mathcal{C}$, or restrict to normalized $A$ and constrain $\zeta$ to be tangent to $\mathcal{N}$. Our first result gives
a solid ground to this principle (Theorem 3.4):

Theorem A. The set $\mathcal{N}$ of normalized potentials is an analytic submanifold of $\mathcal{X}(\Omega)$
and its tangent space at $A$ is $\ker \mathscr{L}_A$, which is a topological complement to $\mathcal{C}$.

From this we will easily deduce the analyticity and derivative of several important
maps. Consider:

• the normalization map $N : \mathcal{X}(\Omega) \to \mathcal{N}$ which sends $A$ to the unique normalized
potential in its class modulo $\mathcal{C}$,

• the leading eigenvalue map $\Lambda : A \mapsto \lambda_A$,

• the leading eigenfunction map $H : A \mapsto h_A$,

• the Gibbs map $G : A \mapsto \mu_A$ taking its values in $\mathcal{X}(\Omega)^*$ with the convention that

$$\mu_A(\varphi) = \int \varphi \, d\mu_A.$$

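Then we get (Theorem 3.5 to Proposition 3.7):

Corollary B. The maps $N$, $\Lambda$, $H$, $G$ are analytic and for all $A \in \mathcal{X}(\Omega)$:

• the differential $DN_A$ of $N$ at $A$ is the linear projection on $\ker \mathscr{L}_{N(A)}$ along $\mathcal{C}$,

• $D(\log \Lambda)_A = \mu_A$ as a linear form on $\mathcal{X}(\Omega)$, i.e.

$$\frac{d}{dt}\Big|_{t=0} \log \lambda_{A+t\zeta} = \int \zeta \, d\mu_A \qquad \forall A, \zeta \in \mathcal{X}(\Omega).$$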
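The derivative formula for $\log \Lambda$ can be checked numerically in a toy case. The sketch below assumes (our choice, not the paper's) the shift on $\{0,1\}^{\mathbb{N}}$ with potentials and perturbations depending on the first two coordinates, stored as hypothetical tables `A[a][j]` and `zeta[a][j]`; it compares a central finite difference of $t \mapsto \log \lambda_{A+t\zeta}$ with the integral of $\zeta$ against the Gibbs measure of two-letter cylinders.

```python
import math

def eigendata(A):
    """Leading eigenvalue and eigenvectors of M[j][a] = exp(A[a][j]), scaled so nu.h = 1."""
    M = [[math.exp(A[a][j]) for a in range(2)] for j in range(2)]
    p, q, r, s = M[0][0], M[0][1], M[1][0], M[1][1]
    lam = (p + s + math.sqrt((p - s) ** 2 + 4 * q * r)) / 2
    h = [q, lam - p]    # right eigenvector
    nu = [r, lam - p]   # left eigenvector
    z = h[0] * nu[0] + h[1] * nu[1]
    h = [v / z for v in h]
    return lam, h, nu

def gibbs_cylinders(A):
    """Gibbs measure of the cylinders [a j]: nu(j) * exp(A[a][j]) * h(a) / lambda."""
    lam, h, nu = eigendata(A)
    return [[nu[j] * math.exp(A[a][j]) * h[a] / lam for j in range(2)]
            for a in range(2)]

def dlog_lambda(A, zeta, t=1e-6):
    """Central finite difference of t -> log(lambda_{A + t*zeta}) at t = 0."""
    def log_lam(u):
        shifted = [[A[a][j] + u * zeta[a][j] for j in range(2)] for a in range(2)]
        return math.log(eigendata(shifted)[0])
    return (log_lam(t) - log_lam(-t)) / (2 * t)

A = [[0.3, -0.2], [0.1, 0.5]]
zeta = [[1.0, -0.5], [0.25, 2.0]]
mu = gibbs_cylinders(A)
integral = sum(zeta[a][j] * mu[a][j] for a in range(2) for j in range(2))
print(dlog_lambda(A, zeta), integral)  # the two numbers should agree closely
```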
The analyticity of these maps and the derivative of $\log \Lambda$ are well-known for many
dynamical systems;^3 however our framework is quite general (e.g. we do not assume
any high-temperature hypothesis) and our method is quite elementary: we only use basic
differential calculus, not complex analysis nor Kato’s theory of regularity of eigendata
for operators (as done, for example, in [PP90], [dSdSSld]). 


1.2 From integral differentiation to a Riemannian metric 

Both derivatives above are really easy to obtain, but the derivative of $G$ is slightly more
complicated (Theorem 4.1):

Theorem C. For all $A, \varphi, \zeta \in \mathcal{X}(\Omega)$ we have

$$\frac{d}{dt}\Big|_{t=0} \int \varphi \, d\mu_{A+t\zeta} = \int (I - \mathscr{L}_{N(A)})^{-1}(\varphi_A) \cdot DN_A(\zeta) \, d\mu_A$$

where $\varphi_A = \varphi - \int \varphi \, d\mu_A$.

(Note that of course the left-hand side is $DG_A(\zeta) \in \mathcal{X}(\Omega)^*$ applied at $\varphi$.)

This derivative can then be expressed in various forms using standard computations,
see Sections 4 and 5, and some interesting connections appear.

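Theorem D. All the following expressions

$$\langle \zeta, \eta \rangle_A = D^2(\log \Lambda)_A(\zeta, \eta)$$

$$\langle \zeta, \eta \rangle_A = \frac{d}{dt}\Big|_{t=0} \int \eta \, d\mu_{A+t\zeta}$$

$$\langle \zeta, \zeta \rangle_A = \mathrm{Var}(\zeta_A, \mu_A) := \lim_{n \to \infty} \frac{1}{n} \int \Big( \sum_{i=0}^{n-1} \zeta_A \circ T^i \Big)^2 \, d\mu_A$$

$$\langle \zeta, \eta \rangle_A = \int \zeta \eta \, d\mu_A \quad \text{whenever } A \in \mathcal{N} \text{ and } \zeta, \eta \in T_A\mathcal{N}$$

define the same analytic map $A \mapsto \langle \cdot, \cdot \rangle_A$ from $\mathcal{X}(\Omega)$ to the Banach space of symmetric
linear 2-forms, such that $\langle \cdot, \cdot \rangle_A$ is positive-semi-definite with kernel equal to $\mathcal{C}$ for all $A$.
This map induces by restriction a weak Riemannian metric on $\mathcal{N}$, and then by projection
it induces a weak Riemannian metric on $\mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$.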
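Two features of this statement can be observed numerically in a toy model: the form vanishes on constants plus coboundaries, and it agrees with an asymptotic variance that can be computed by hand. The following sketch assumes (our setting, not the paper's) the shift on $\{0,1\}^{\mathbb{N}}$ and two-coordinate potentials stored as tables `A[a][j]`; the helper names are hypothetical, and the second derivative of $\log \Lambda$ is only approximated by a second central difference.

```python
import math

def log_lambda(A):
    """log of the leading eigenvalue of the matrix M[j][a] = exp(A[a][j])."""
    p, q = math.exp(A[0][0]), math.exp(A[1][0])   # M[0][0], M[0][1]
    r, s = math.exp(A[0][1]), math.exp(A[1][1])   # M[1][0], M[1][1]
    return math.log((p + s + math.sqrt((p - s) ** 2 + 4 * q * r)) / 2)

def metric(A, zeta, t=1e-4):
    """Second central difference of u -> log_lambda(A + u*zeta) at u = 0."""
    def f(u):
        return log_lambda([[A[a][j] + u * zeta[a][j] for j in range(2)]
                           for a in range(2)])
    return (f(t) - 2 * f(0.0) + f(-t)) / t ** 2

A = [[0.3, -0.2], [0.1, 0.5]]

# A constant plus a coboundary: zeta(y) = g(y_0) - g(y_1) + c.
g, c = [0.7, -0.4], 0.25
coboundary = [[g[a] - g[j] + c for j in range(2)] for a in range(2)]
print(metric(A, coboundary))   # numerically zero: this direction lies in the kernel

# Indicator of the cylinder [00] at the zero potential (uniform Bernoulli measure).
zero = [[0.0, 0.0], [0.0, 0.0]]
indicator = [[1.0, 0.0], [0.0, 0.0]]
print(metric(zero, indicator))  # close to 5/16
```

The value 5/16 matches a direct computation of the asymptotic variance of the indicator of $[00]$ under the uniform Bernoulli measure: the variance is 3/16, the lag-one covariance is 1/16, and all further covariances vanish.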

The metric $\langle \cdot, \cdot \rangle_A$ is thus a close cousin to McMullen’s variance metric introduced
in the context of Teichmüller space [McM08]^4 (up to a conformal rescaling by entropy),
contains the derivative of the Gibbs map, controls the convexity of $\log \Lambda$ and extends
the $L^2(\mu_A)$ metric on $\mathcal{N}$ at the same time.

^3 For a historical account of the problem, see the introduction of [BCV12] and references therein
(among others [PP90], [Mah90], [BS01]).

^4 See also [BCS15] by Bridgeman, Canary and Sambarino and references therein, and [PS14] by Pollicott
and Sharp for an analogous metric of Weil-Petersson type on spaces of metric graphs.
In the closing Section 8, we show a concrete example of this approach. When the
dynamics is just the shift on the Bernoulli space $\{1,2\}^{\mathbb{N}}$ and the potential depends
only on two coordinates, we exhibit the metric explicitly and we compute the curvature
(Proposition 8.1), which is positive. In analogy with the work of McMullen, one could
conjecture that when our metric is rescaled by the entropy, the curvature is strictly
negative, but we show that this is not the case.

1.3 The optimal transportation approach to the differentiability 
of measure-valued maps 

Above we took a very common point of view, considering the Gibbs map $G : A \mapsto \mu_A$
as taking value in (an affine subspace of) the Banach space $\mathcal{X}(\Omega)^*$, yielding an obvious
differential structure in which each $\varphi \in \mathcal{X}(\Omega)$ defines a “coordinate function” by
$\mu_A \mapsto \mu_A(\varphi) = \int \varphi \, d\mu_A$. We call this the “affine differential structure”.

However, this is not the only way to study the regularity of such a map, and in Section
6 we study the “Wasserstein differential structure” aspect of the question. One can see
$G$ as taking values in the subset $\mathcal{P}_T(\Omega)$ of $T$-invariant measures in the set $\mathcal{P}(\Omega)$ of all
probability measures, and use the differential framework based on the 2-Wasserstein
distance $W_2$ from optimal transportation which has been developed in the last fifteen
years.^5 This point of view proved useful in the study of the action of expanding circle
maps near the absolutely continuous invariant measure by one of the present authors
(see [Klo13, Klo15a]); here, we show that with the 2-Wasserstein metric the Gibbs map
$A \mapsto \mu_A$ is far from being differentiable even in the simplest smoothest case.

Theorem E. Assume $T : x \mapsto dx \bmod 1$ is the standard $d$-self-covering map of the
circle $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$ and $\mathcal{X}(\Omega)$ is the space of $\alpha$-Hölder functions for some $\alpha \in (0,1]$. Then
given any smooth path $(A_t)$ in $\mathcal{X}(\Omega)$, the path of Gibbs measures $(\mu_{A_t})$ is not absolutely
continuous in $(\mathcal{P}(\Omega), W_2)$ unless it is constant.

Recall that a path in a metric space defined on an interval $I$ is said to be absolutely
continuous when it has a metric speed in $L^1(I)$; this is a very weak regularity, so that
Theorem E can be interpreted as meaning that a small perturbation of the potential
induces a brutal reallocation of the mass distribution in the sense of $W_2$. This contrasts
with a Lipschitz regularity result obtained for the 1-Wasserstein metric in [KLS14] (but
note that $W_1$ does not yield a differential structure).

1.4 Applications to equilibrium states 

We end this introduction by presenting some applications and illustrations of our
differential calculus setting and the metric obtained above.

^5 The full story does not fit in the bottom margin, but let us mention the important cornerstones
which are the works of Otto [Ott01], Benamou and Brenier [BB00], and Ambrosio, Gigli and Savaré
[AGS05]; see also [Vil03], [Vil09] and [Gig11].
First we show that it provides a convenient way to derive, and extend to our setting,
the existence and uniqueness of equilibrium states. Consider the following optimization
problems and induced functionals:

$$h_{\mathcal{X}}(\mu) := \inf_{A \in \mathcal{X}(\Omega)} \Big( \log \lambda_A - \int A \, d\mu \Big) \qquad \text{for } \mu \in \mathcal{P}_T(\Omega)$$

$$\mathrm{Pr}(B) := \sup_{\mu \in \mathcal{P}_T(\Omega)} \Big( h_{\mathcal{X}}(\mu) + \int B \, d\mu \Big) \qquad \text{for } B \in \mathcal{X}(\Omega)$$

(recall that $\mathcal{X}(\Omega)$ is a suitable space of potentials $\Omega \to \mathbb{R}$ which we can choose with some
freedom).

Theorem F. For all $B \in \mathcal{X}(\Omega)$, the supremum in the definition of $\mathrm{Pr}(B)$ is uniquely
realized by $\mu_B$ and it holds $\mathrm{Pr}(B) = \log \lambda_B$.

We show in Remark 7.5 that for the case of the Classical Thermodynamical Formalism
in the sense of [PP90] (the shift acting on the Bernoulli space) and for any invariant
probability $\mu$, we have equality between $h_{\mathcal{X}}(\mu)$ and the metric entropy of $\mu$. In this case
the pressure $\mathrm{Pr}$ defined above also coincides with the usual topological pressure. We
consider however more general hypotheses in our reasoning. We will refer to $h_{\mathcal{X}}$ and $\mathrm{Pr}$
as “entropy” and “pressure” from now on.

We then observe that the metric $\langle \cdot, \cdot \rangle_A$ enables us to define the gradient of various
natural dynamical quantities, including entropy and pressure (see Proposition 7.10).
This gives sense to the gradient flow of the functional

$$A \mapsto h_{\mathcal{X}}(\mu_A) + \int B \, d\mu_A$$

obtained by composing $G$ with the functional defining the pressure. This gradient flow
has a linear form when expressed in the quotient space $\mathcal{Q}$ and can serve as a model
for non-equilibrium dynamics, according to which a system out of equilibrium behaves
just like a system at equilibrium with a varying potential (Section 7.3.2). In case of a
mere change in the temperature of the system’s environment, this model predicts the
physically sound property that the system evolves only in its temperature (Remark
7.11).

As a consequence of Theorems D and F we obtain several results related to works by
Kucherenko and Wolf. The first result, obtained in [KW14] under somewhat different
hypotheses, is a prescription result. Given $\Phi = (\varphi_1, \dots, \varphi_K)$ a tuple of test functions in
$\mathcal{X}(\Omega)$, the “rotation vector”

$$\mathrm{rv}(\mu) = \Big( \int \varphi_1 \, d\mu, \dots, \int \varphi_K \, d\mu \Big)$$

of a $T$-invariant measure $\mu$ describes some convex set $\mathrm{Rot}(\Phi) \subset \mathbb{R}^K$. The result is then
that for every base potential $B$, every interior value of $\mathrm{Rot}(\Phi)$ can be realized uniquely as
the rotation vector of the Gibbs measure of a potential of the form
$B + a_1 \varphi_1 + \dots + a_K \varphi_K$ (Theorem 7.13).
The second result states existence and uniqueness of equilibrium states under linear
constraints; it is very close to Theorem B of [KW13], but even disregarding the difference
in our hypotheses we obtain a more precise description of the equilibrium state: the
parameter $s$ of [KW13] is always equal to 1. In other words:

Theorem G. Let $\Phi = (\varphi_1, \dots, \varphi_K) \in \mathcal{X}(\Omega)^K$ be such that $0 \in \mathrm{int}\,\mathrm{Rot}(\Phi)$. Given any
$B \in \mathcal{X}(\Omega)$, the restriction of

$$P_B : \mu \mapsto h_{\mathcal{X}}(\mu) + \int B \, d\mu$$

to the set $\mathcal{P}_T[\Phi]$ of invariant measures realizing $\int \varphi_k \, d\mu = 0$ for all $k$ is uniquely
maximized at the unique Gibbs measure in $\mathcal{P}_T[\Phi]$ that is defined by a potential of the form
$B + a_1 \varphi_1 + \dots + a_K \varphi_K$.

We also recover Theorem B in [KW14] (the supremum of entropy of measures realizing
a given vector in the interior of $\mathrm{Rot}(\Phi)$ depends analytically on the vector, Corollary
7.16), and by nature our method could be applied to more general constraints (e.g.
asking that $\mathrm{rv}(\mu)$ belongs to some submanifold of $\mathrm{Rot}(\Phi)$).

Theorem G notably shows that when $T$ is the shift map and the test functions and the
potential $B$ all depend only on $n$ coordinates, so does the potential of the constrained
equilibrium state, which is thus an $(n-1)$-Markovian measure (Remark 7.17, which also
follows from the results of [KW13] but is not stated there).

This result is precise enough to yield explicit solutions to some concrete maximizing 
questions, which as far as we know would be difficult to solve without it. Let us give a 
toy example which turns out to have a nontrivial answer. 

Example H. Assume $T$ is the shift map on $\Omega = \{0,1\}^{\mathbb{N}}$. Among shift-invariant
measures $\mu$ such that $\mu(01*) = 2\mu(11*)$, the Markov measure associated to the transition
probabilities

$$P(0 \to 0) = 1 - a \qquad P(0 \to 1) = a$$

$$P(1 \to 0) = \tfrac{2}{3} \qquad P(1 \to 1) = \tfrac{1}{3}$$

where $a$ is the only real solution to

$$(1 - a)^5 = \frac{4}{27} a^2 \qquad (a \simeq 0.487803)$$

uniquely maximizes entropy.
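The numerical value in Example H can be reproduced with a few lines of Python. This is only a sanity check under stated assumptions: we read the constraint as $\mu(01*) = 2\mu(11*)$, which forces $P(1 \to 1) = 1/3$ for a stationary Markov measure, and we compare entropies only among such Markov measures (that the maximizer is Markov is the content of Theorem G, not something this code proves). The function names are ours.

```python
import math

def solve_a(lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for the root of F(a) = (1-a)^5 - (4/27) a^2 in [0, 1].

    F is strictly decreasing on [0, 1] with F(0) > 0 > F(1), so the root is unique there.
    """
    F = lambda a: (1 - a) ** 5 - 4 / 27 * a ** 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def binary_entropy(p):
    """Entropy of a (p, 1-p) distribution, with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in (p, 1 - p) if x > 0)

def markov_entropy(a, b=2 / 3):
    """Entropy of the stationary Markov measure with P(0->1) = a and P(1->0) = b."""
    pi0, pi1 = b / (a + b), a / (a + b)   # stationary distribution
    return pi0 * binary_entropy(a) + pi1 * binary_entropy(b)

a_star = solve_a()
print(a_star, markov_entropy(a_star))
```

Setting the derivative of `markov_entropy` to zero leads back to the quintic equation displayed in Example H, so the bisection root and the maximizer of the entropy along this one-parameter family coincide.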

As a final remark, we mention that optimization problems such as we solve in Theorem
G appear naturally in multifractal analysis, see [BS01], [BSS02], [Cli14]. Our approach
might lead to explicit computation in that field.
2 Notation and preliminaries

We shall consider the thermodynamical formalism associated with a discrete-time,
continuous-space dynamical system. The phase space shall be denoted by $\Omega$, and will be
assumed to be a compact metric space, whose metric shall be denoted by $d$. The time
evolution is then described by a map $T : \Omega \to \Omega$ which will be assumed to be a
finite-to-one map. We will denote by $\mathcal{P}(\Omega)$ the set of probability measures on $\Omega$, and by
$\mathcal{P}_T(\Omega)$ the subset of $T$-invariant measures.

Typical cases to serve as examples are the shift on $\mathcal{A}^{\mathbb{N}}$ where $\mathcal{A}$ is a finite alphabet,
and the maps $x \mapsto dx \bmod 1$ acting on the circle $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$. The reader not willing to deal
with the detailed hypotheses below can assume $T$ is one of these classical maps.

Remark 2.1. We also want to consider cases such as the tent map

$$x \mapsto \begin{cases} 2x & \text{if } x \le \tfrac{1}{2} \\ 2 - 2x & \text{if } x \ge \tfrac{1}{2} \end{cases}$$
on the interval $[0,1]$. This map has a particularity shared with other maps of the same
kind: one point, $1$, has only one inverse image (namely $1/2$) while its neighboring points
have two. This will make it necessary to adjust some of the definitions below. Let us
formalize a property of the tent map which we will refer to when we explain these
modifications: the tent map has local inverse branches in the sense that for all $x \in \Omega$
there is an integer $d \ge 2$ (to be implicitly taken minimal), a neighborhood $V$ of $x$ and
continuous maps $y_k : V \to V_k \subset \Omega$ (where $k \in \{1, \dots, d\}$) such that for all $x' \in V$ we have

$$T^{-1}(x') = \{ y_1(x'), \dots, y_d(x') \}.$$

2.1 Working Hypotheses 

2.1.1 Space of potentials 

The first set of assumptions we make concerns the regularity of potentials; in designing
the hypotheses below we tried to keep them general enough not to rule out non-continuous
potentials; e.g. in some settings bounded variation functions are meaningful (in
particular when $T$ is only piecewise continuous).

We fix for all the article a space of functions $\mathcal{X}(\Omega)$, endowed with a norm $\|\cdot\|$, satisfying
the following.


(H1) $\mathcal{X}(\Omega)$ is a Banach space of Borel-measurable, bounded functions $\Omega \to \mathbb{R}$, which
includes all constant functions; for all $f, g \in \mathcal{X}(\Omega)$ we have

$$\|fg\| \le \|f\| \, \|g\|;$$

for all $f \in \mathcal{X}(\Omega)$ that is positive and bounded away from 0, the function $\log f$
also lies in $\mathcal{X}(\Omega)$; and for some constant $C$ it holds

$$\|f\| \ge C \sup_x |f(x)|.$$

In particular for each probability measure $\mu$ on $\Omega$, the linear form defined by
$f \mapsto \int f \, d\mu$ is continuous: in other words, every probability measure can be seen as an
element of $\mathcal{X}(\Omega)^*$.

Note that when $f \in \mathcal{X}(\Omega)$ is positive and bounded away from 0, $1/f = e^{-\log f}$ is also
in $\mathcal{X}(\Omega)$.


Remark 2.2. In some circumstances, one works with a norm satisfying only the weak
multiplicativity condition $\|fg\| \le C \|f\| \, \|g\|$ for some positive constant $C$. Then one
can define a new, equivalent norm $\|\cdot\|' = C \|\cdot\|$ which is then multiplicative.


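Example 2.3. The space $\mathrm{Hol}_\alpha(\Omega)$ of $\alpha$-Hölder functions (for some $\alpha \in (0,1]$) with its
usual norm

$$\|f\| = \sup_{x} |f(x)| + \sup_{x \neq y \in \Omega} \frac{|f(x) - f(y)|}{d(x,y)^\alpha}$$

satisfies (H1). When $\alpha = 1$, we get the space $\mathrm{Lip}(\Omega)$ of Lipschitz-continuous functions.
Note that $d^\alpha$ is a distance on $\Omega$, and that $\mathrm{Hol}_\alpha(\Omega)$ coincides with $\mathrm{Lip}(\Omega, d^\alpha)$.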


Next, we need a compatibility hypothesis between $T$ and $\mathcal{X}(\Omega)$.

(H2) $T$ preserves $\mathcal{X}(\Omega)$ forward and backward, i.e. the composition operator $f \mapsto f \circ T$
is well-defined and continuous from $\mathcal{X}(\Omega)$ to itself, and for all $f \in \mathcal{X}(\Omega)$, we have

$$g : x \mapsto \sum_{T(y)=x} f(y) \in \mathcal{X}(\Omega) \quad \text{and} \quad \|g\| \le C \|f\|$$

for some constant $C$ (i.e. $f \mapsto g$ is a continuous operator on $\mathcal{X}(\Omega)$).

Example 2.4. When $\mathcal{X}(\Omega) = \mathrm{Hol}_\alpha(\Omega)$, it is sufficient to ask the map $T$ to be a local
bi-Lipschitz homeomorphism to obtain (H2).

Remark 2.5. The tent map does not strictly speaking satisfy this compatibility when
for example $\mathcal{X}(\Omega) = \mathrm{Hol}_\alpha(\Omega)$, because $1$ only has one inverse image and $g$ is usually not
even continuous. One can fix such cases by introducing a suitable weight in all sums
$\sum_{T(y)=x} f(y)$, i.e. $\sum_{T(y)=1} f(y)$ should be interpreted as $f(\tfrac{1}{2}) + f(\tfrac{1}{2})$ to ensure continuity
in $x$ of $\sum_{T(y)=x} f(y)$. In other words, if needed $\sum_{T(y)=x} f(y)$ is replaced everywhere
by $\sum_k f(y_k(x))$ where the $y_k$ are the local inverse branches of $T$.
2.1.2 Transfer operator


The composition operator arising from $T$ is the natural functional counterpart to our
dynamical system; in fact, most properties of ergodic flavor of $T$ are naturally formulated
in terms of the composition operator on a certain class of functions. However it is useful
to investigate its “inverses”, the transfer operators. Given a “potential” $A \in \mathcal{X}(\Omega)$, one
defines a transfer operator (also called a Ruelle operator) by

$$\mathscr{L}_A(f)(x) = \sum_{T(y)=x} e^{A(y)} f(y);$$

note that since $\mathcal{X}(\Omega)$ is a Banach algebra, $e^A$ lies in $\mathcal{X}(\Omega)$ and so does $e^A f$; hypothesis
(H2) also implies that $\mathscr{L}_A$ is a continuous operator.

Since $\mathcal{X}(\Omega)$ is a space of functions, it contains a canonical “positive cone”, the set of
positive functions, which is convex and invariant by dilation. By design, the transfer
operator is positive in the sense that it maps the positive cone into itself. Typical
expanding assumptions for $T$ ensure that the positive cone is even mapped into a narrower
cone, inducing a contraction on the set of positive directions endowed with a suitable
distance (see e.g. [Bal00]). Instead of assuming this kind of hypothesis on $T$, we shall
only assume the consequences that are usually drawn from them. Namely, we ask that
$(T, \mathcal{X}(\Omega))$ satisfies a Ruelle-Perron-Frobenius theorem (including a spectral gap) in the
sense of the following two hypotheses.

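(H3) For all $A \in \mathcal{X}(\Omega)$ the transfer operator $\mathscr{L}_A$ has a positive maximal eigenvalue $\lambda_A$
and a positive, bounded away from 0 eigenfunction $h_A \in \mathcal{X}(\Omega)$:

$$\mathscr{L}_A(h_A) = \lambda_A h_A,$$

and the dual operator $\mathscr{L}_A^*$ of $\mathscr{L}_A$ preserves the set of finite measures and has an
eigenmeasure $\nu_A \in \mathcal{P}(\Omega)$ for the eigenvalue $\lambda_A$, in particular

$$\int \mathscr{L}_A(f) \, d\nu_A = \lambda_A \int f \, d\nu_A \qquad \forall f \in \mathcal{X}(\Omega).$$

Observe that when all functions of $\mathcal{X}(\Omega)$ are continuous, $\mathscr{L}_A$ extends to all continuous
functions and then the dual operator automatically acts on measures.

(H4) For all $A \in \mathcal{X}(\Omega)$, there are positive constants $D$, $\delta$ such that for all $n \in \mathbb{N}$ and
all $f \in \mathcal{X}(\Omega)$ such that $\int f \, d\nu_A = 0$, we have

$$\|\mathscr{L}_A^n(f)\| \le D \lambda_A^n (1 - \delta)^n \|f\|.$$

It follows in particular that $\lambda_A$ is a simple eigenvalue and that $\nu_A$ defines a natural
(topological) complement to its eigendirection.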
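The spectral gap can be seen very concretely in a finite-dimensional toy case. The sketch below assumes (our choice) the shift on $\{0,1\}^{\mathbb{N}}$ and a normalized potential depending on two coordinates, so that on functions of the first coordinate the transfer operator is a row-stochastic matrix `P[j][a]`; the specific numbers are hypothetical. A function with zero integral against the eigenmeasure then decays geometrically under iteration, here with rate $1 - \delta = 0.3$ and $D = 1$ for the sup norm.

```python
def transfer(P, f):
    """Apply the matrix transfer operator: (L f)(j) = sum_a P[j][a] * f(a)."""
    return [sum(P[j][a] * f[a] for a in range(2)) for j in range(2)]

def sup_norms(P, f, n):
    """Sup norms of f, L f, ..., L^n f."""
    norms = []
    for _ in range(n + 1):
        norms.append(max(abs(v) for v in f))
        f = transfer(P, f)
    return norms

# P[j][a] = exp(A(a, j)); each row sums to 1, i.e. the potential is normalized.
P = [[0.7, 0.3], [0.4, 0.6]]
nu = [4 / 7, 3 / 7]   # eigenprobability: nu P = nu
f = [3.0, -4.0]       # zero integral against nu: nu[0]*3 + nu[1]*(-4) = 0

norms = sup_norms(P, f, 10)
ratios = [norms[k + 1] / norms[k] for k in range(10)]
print(ratios)  # every ratio is close to 0.3, the second eigenvalue of P
```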
It is easy to see that $\mu_A = h_A \nu_A$ defines an invariant measure for $T$, and up to
normalizing $h_A$ we can assume $\mu_A$ is a probability measure which we will call the Gibbs
measure of $A$.

Example 2.6. When $T$ is expanding in a relatively general sense and $\mathcal{X}(\Omega)$ is a space
of Hölder functions, (H3) and (H4) are proved in [KLS14] (the spectral gap is proved
there for normalized potentials only, but see Remark 2.9).

2.1.3 Further hypotheses 

Our first results will only use (H1) to (H4), but at some point we will need two further
hypotheses, which feel harmless (in the sense that they hold for most if not all relevant
examples), but which do not follow from the previous ones.

From Section 5 on, we will assume:

(H5) For all $A, f \in \mathcal{X}(\Omega)$, if $f$ is non-negative and $\int f \, d\mu_A = 0$ then $f = 0$.

Remark 2.7. If all functions in $\mathcal{X}(\Omega)$ are continuous, it is sufficient to ask that $\mu_A$ has
full support for all $A$ to ensure (H5).

Example 2.8. Assume that $T$ is continuous, that all functions in $\mathcal{X}(\Omega)$ are continuous,
and that the only closed subsets $K \subset \Omega$ which are both forward and backward invariant
(i.e. $T(K) = T^{-1}(K) = K$) are the empty set $\emptyset$ and the full space $\Omega$. Then (H5) holds.

Indeed, since $\mu_A$ is an invariant measure, its support is a closed invariant subset
of $\Omega$. But (assuming without loss of generality that $A$ is normalized, see below) the
invariance under $\mathscr{L}_A^*$ and the fact that $e^A$ is a positive function also implies that
$\mathrm{supp}\, \mu_A$ is backward invariant, so that $\mu_A$ must have full support. The continuity of
$f$ then gives the conclusion.

In Section 7 we will use the following largeness hypothesis, meant to avoid degenerate
cases such as $\mathcal{X}(\Omega) = \{\text{constants}\}$.

(H6) All continuous functions $f : \Omega \to \mathbb{R}$ can be uniformly approximated by elements
of $\mathcal{X}(\Omega)$.

(Note that we do not imply here that the functions in $\mathcal{X}(\Omega)$ are continuous themselves.)

2.2 Normalization 

Among the potentials, of particular importance are the normalized ones, i.e. those
potentials $A$ such that $\mathscr{L}_A(\mathbf{1}) = \mathbf{1}$ (where $\mathbf{1}$ denotes the constant function with value 1)
i.e. such that $\lambda_A = 1$ and $h_A = \mathbf{1}$. In other words, $A$ is normalized when

$$\sum_{T(y)=x} e^{A(y)} = 1 \qquad \forall x \in \Omega. \qquad (1)$$

Two nice properties that give a first evidence for the relevance of this definition are that
when $A$ is normalized, first $\mathscr{L}_A$ is a left-inverse to the composition operator:

$$\mathscr{L}_A(f \circ T) = f \qquad \forall f \in \mathcal{X}(\Omega),$$
second $\mathscr{L}_A^*$ preserves the set of probability measures. One can then interpret $\mathscr{L}_A^*$ as a
Markov chain, the numbers $e^{A(y)}$ representing the probability of transiting from $x$ to $y$
whenever $T(y) = x$; a realization of this Markov chain is a random reverse orbit of $T$.

As is well-known, the Ruelle-Perron-Frobenius theorem enables one to “normalize” a
potential $A$, by writing

$$B = A + \log h_A - \log h_A \circ T - \log \lambda_A \in \mathcal{X}(\Omega).$$


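Then one gets

$$\mathscr{L}_B(f) : x \mapsto \sum_{T(y)=x} e^{A(y)} \frac{h_A(y)}{\lambda_A h_A(x)} f(y), \qquad \text{i.e.} \qquad \mathscr{L}_B(f) = \frac{1}{\lambda_A h_A} \mathscr{L}_A(h_A f),$$

where (H3) ensures that $h_A$ is bounded away from 0 and (H1) then ensures that
$1/h_A \in \mathcal{X}(\Omega)$. The transfer operators $\mathscr{L}_A$ and $\mathscr{L}_B$ are thus conjugated one to another up to a
multiplicative constant $\lambda_A$, the conjugating operator being the multiplication by $h_A$; in
particular

$$\mathscr{L}_B(\mathbf{1}) = \frac{1}{\lambda_A h_A} \mathscr{L}_A(h_A) = \mathbf{1}.$$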

This conjugacy shows that the Gibbs measure $\mu_A = h_A \nu_A$ is also the eigenprobability
$\mu_B = \nu_B$ of $\mathscr{L}_B^*$: each potential yields an invariant probability measure, but several
potentials can yield the same Gibbs measure.

Using the same computation as above, one sees that whenever two arbitrary
potentials $A$, $B$ are related as above, i.e. $B = A + g - g \circ T + c$ for some $g \in \mathcal{X}(\Omega)$ and
$c \in \mathbb{R}$, then their transfer operators and their duals are conjugated one to another up to
a constant:

$$\mathscr{L}_B(\cdot) = e^{c} e^{-g} \mathscr{L}_A(e^{g} \, \cdot),$$

where $e^g$ is in $\mathcal{X}(\Omega)$, positive and bounded away from 0. It follows immediately that (up
to normalizing constants) $h_B = h_A e^{-g}$, $\nu_B = e^{g} \nu_A$ and $\lambda_B = e^{c} \lambda_A$. In particular we have
$\mu_B = \mu_A$: both potentials define the same Gibbs measure. It is also straightforward to
check that if moreover both $A$ and $B$ are normalized, $g$ must be a constant and $c$ must
be zero, so that $A = B$. In other words, we have a subspace

$$\mathcal{C} = \{ g - g \circ T + c \mid g \in \mathcal{X}(\Omega),\ c \text{ a constant} \} \subset \mathcal{X}(\Omega)$$

such that each class modulo $\mathcal{C}$ defines one Gibbs measure, and contains exactly one
normalized potential. One says that a function of the form $g - g \circ T$ is a coboundary,
thus $\mathcal{C}$ is the space generated by coboundaries and constants. All the above is very
classical; our goal is now to study in more detail the following objects:

• the set $\mathcal{N} \subset \mathcal{X}(\Omega)$ of normalized potentials (which is not a linear subspace, see
Remark 5.4),
• the normalization map

$$N : A \mapsto A + \log h_A - \log h_A \circ T - \log \lambda_A$$

from $\mathcal{X}(\Omega)$ to $\mathcal{N}$,

• the quotient $\mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$, and

• the Gibbs map $G : A \mapsto \mu_A$ from $\mathcal{X}(\Omega)$, seen as taking value either in $\mathcal{X}(\Omega)^*$ or in
$\mathcal{P}_T(\Omega) \subset \mathcal{P}(\Omega)$.

The typical questions we want to answer are of differential-geometric flavor: is $\mathcal{N}$ a
submanifold of $\mathcal{X}(\Omega)$? Are the maps $N$ and $G$ differentiable? How to endow $\mathcal{N}$ or $\mathcal{Q}$
with a meaningful Riemannian metric? Can we then study gradient flows of natural
functionals on these spaces?

Remark 2.9. The conjugacy between the transfer operator of a potential $A$ and the
transfer operator of its normalization $B = N(A)$ shows that a spectral gap for $\mathscr{L}_B$
implies the same spectral gap for $\mathscr{L}_A$ (with a different constant $D$, but the same $\delta$).
Indeed, if $\int f \, d\nu_A = 0$ then $\int f/h_A \, d\mu_A = 0$ and

$$\begin{aligned}
\|\mathscr{L}_A^n(f)\| &= \|\lambda_A^n h_A \mathscr{L}_B^n(f/h_A)\| \\
&\le \lambda_A^n \|h_A\| D (1 - \delta)^n \|f/h_A\| \\
&\le \big( D \|h_A\| \, \|1/h_A\| \big) \lambda_A^n (1 - \delta)^n \|f\|.
\end{aligned}$$

In particular, if hypothesis (H3) is satisfied, the spectral gap for normalized potentials
implies (H4) for all potentials.

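Remark 2.10. The spectral gap hypothesis implies an exponential decay of correlation
for functions in $\mathcal{X}(\Omega)$; indeed if $A$ is any potential and $f, g \in \mathcal{X}(\Omega)$ are such that
$\int f \, d\mu_A = 0$, we have

$$\begin{aligned}
\Big| \int f \cdot g \circ T^n \, d\mu_A \Big| &= \Big| \int \mathscr{L}_{N(A)}^n(f \cdot g \circ T^n) \, d\mu_A \Big| \\
&= \Big| \int \mathscr{L}_{N(A)}^n(f) \cdot g \, d\mu_A \Big| \\
&\le \|\mathscr{L}_{N(A)}^n(f) \cdot g\|_\infty \\
&\le C^{-1} \|\mathscr{L}_{N(A)}^n(f)\| \, \|g\| \\
&\le C^{-1} D (1 - \delta)^n \|f\| \, \|g\|.
\end{aligned}$$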

Remark 2.11. In typical situations, a normalized potential $A$ can be recovered from
the Gibbs measure as a Jacobian: for example, if $T$ has inverse branches $y_i$ near each
$x \in \Omega$ which are local homeomorphisms with disjoint images, then

$$e^{A(y_i(x))} = \frac{d\mu_A(y_i(x))}{d\mu_A(x)}$$

in the sense that if $B = B(x, \varepsilon)$ is a small ball around $x$, the ratio of $\mu_A(y_i(B))$ with
respect to $\mu_A(B)$ goes to $e^{A(y_i(x))}$ when $\varepsilon$ goes to zero (in doubt, integrate characteristic
functions of balls with respect to one measure and change variables to verify such a claim).
Slight adaptations of this argument are needed for example for tent maps.

In general, it might a priori happen that two different normalized potentials $A$, $A'$
have the same Gibbs measure. We will see much later in Remark 7.4 that our
assumptions are sufficient to prevent this, and ensure perfect identifications between normalized
potentials, mod $\mathcal{C}$ classes of potentials, and Gibbs measures.

2.3 Analytic maps and submanifolds 

When working in infinite-dimensional spaces, just as differentiability has various
definitions of varying strength (Gâteaux versus Fréchet), the analyticity of a map can be
defined in several ways. Here, we take the strongest definition, recalled below.

First, recall that a closed linear subspace $M$ in a Banach space $\mathcal{X}$ is said to be
topologically complemented, or for short complemented, when there is a closed linear
subspace $N$ which is an algebraic complement. We shall only write $\mathcal{X} = M \oplus N$ when
$M$ and $N$ are topological complements. The projection to $M$ along $N$ and the projection
to $N$ along $M$ are then continuous, i.e. for all $x \in \mathcal{X}$, the decomposition $x = m + n$
with $m \in M$ and $n \in N$ exists, is unique, and $m$ and $n$ depend linearly and continuously on
$x$. Equivalently, $M$ is complemented when it is the image, or the kernel, of a continuous
linear projection $\mathcal{X} \to \mathcal{X}$.

Let $\mathcal{X}$ and $\mathcal{Y}$ be two Banach spaces, whose norms will both be denoted by $\|\cdot\|$. A
continuous, symmetric, multilinear operator $a : \mathcal{X}^k \to \mathcal{Y}$ has an operator norm denoted
by $|a|$; if $\zeta$ is a vector in $\mathcal{X}$, we denote by $\zeta^{(k)}$ the element $(\zeta, \zeta, \dots, \zeta)$ of $\mathcal{X}^k$ and we
have

$$\|a(\zeta^{(k)})\| \le |a| \, \|\zeta\|^k.$$

We shall say that a sequence $a_k : \mathcal{X}^k \to \mathcal{Y}$ of such $k$-ary operators ($k \ge 0$) is a series
with positive radius of convergence if the complex series

$$\sum_{k \ge 0} |a_k| z^k$$

has a positive radius of convergence in $\mathbb{C}$.

Let $\Phi : U \subset \mathcal{X} \to \mathcal{Y}$ be a map defined from an open subset of $\mathcal{X}$. We say that $\Phi$ is
analytic if for each $x \in U$ there is a series of $k$-linear, symmetric, continuous operators
$a_k : \mathcal{X}^k \to \mathcal{Y}$ with positive radius of convergence such that on an open neighborhood of
$x$ in $U$ the following identity holds:

$$\Phi(x + \zeta) = \sum_{k \ge 0} a_k(\zeta^{(k)}).$$

An analytic map is smooth (in particular, Fréchet differentiable and locally
Lipschitz-continuous) and the operators $a_k$ are uniquely defined by $\Phi$. Most classical results hold
in this context, in particular the inverse function theorem and the implicit function
theorem (see [Cha85] and [Whi65]).
More precisely, a map $\Phi : U \subset \mathcal{X} \to \mathcal{Y}$ which is analytic and such that $D\Phi_x : \mathcal{X} \to \mathcal{Y}$
is a topological isomorphism for each $x$, has a local reciprocal near each point, which
is itself analytic (inverse function theorem); if an analytic map $F : U \subset \mathcal{X} \to \mathcal{Y}$ is such that
$F(x) = 0$ for some $x \in U$, $DF_x$ is onto $\mathcal{Y}$ and $\ker DF_x$ is complemented in $\mathcal{X}$, then
the level set $F^{-1}(0)$ is an analytic submanifold of $\mathcal{X}$ in a neighborhood of $x$ (implicit
function theorem). In particular, this means that there is an analytic diffeomorphism
defined in a neighborhood of $x$ that maps $F^{-1}(0)$ to (an open set of) a complemented,
closed, linear subspace of $\mathcal{X}$; it also means that $F^{-1}(0)$ can be locally written as the
graph of an analytic map over a complemented subspace.

3 Normalizing potentials 

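We will now consider the set of normalized potentials

\[ \mathcal{N} = \{ A \in \mathcal{X}(\Omega) \mid \mathscr{L}_A(\mathbf{1}) = \mathbf{1} \} \]

and the normalization map $N$ that sends any potential $A$ to its normalization:

\[ N(A) = A - \log \lambda_A + \log h_A - \log h_A \circ T. \]

The map $N$ can be described as the (non-linear) projection on $\mathcal{N}$ along

\[ \mathcal{C} = \{ g - g \circ T + c \mid g \in \mathcal{X}(\Omega),\ c \text{ a constant} \}. \]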

We start with a simple lemma which will both prove useful and serve as an example of
the use of convergent series in our study. We shall denote by $\ker \mu_A \subset \mathcal{X}(\Omega)$ the kernel
of $\mu_A$ seen as a linear form, i.e.

\[ \ker \mu_A := \Big\{ f \in \mathcal{X}(\Omega) \ \Big|\ \int f \,\mathrm{d}\mu_A = 0 \Big\}. \]

Lemma 3.1. If $A$ is a normalized potential, the operator $I - \mathscr{L}_A$ is onto $\ker \mu_A$, and its
corestriction to $\ker \mu_A$ has a continuous inverse given by

\[ (I - \mathscr{L}_A)^{-1} = \sum_{k=0}^{\infty} \mathscr{L}_A^k : \ker \mu_A \to \mathcal{X}(\Omega). \]

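Proof. For all $f \in \mathcal{X}(\Omega)$, we have that

\[ \int (f - \mathscr{L}_A f) \,\mathrm{d}\mu_A = \int f \,\mathrm{d}\mu_A - \int f \,\mathrm{d}(\mathscr{L}_A^* \mu_A) = 0, \]

because $\mu_A$ is fixed by $\mathscr{L}_A^*$. It follows that $I - \mathscr{L}_A$ takes its values in $\ker \mu_A$.
For all $f \in \ker \mu_A$ and all $n$, we have

\[ (I - \mathscr{L}_A) \Big( \sum_{k=0}^{n} \mathscr{L}_A^k f \Big) = f - \mathscr{L}_A^{n+1} f. \]

By the spectral gap assumption, we obtain that $\sum_k \mathscr{L}_A^k f$ converges, that its norm is bounded
by a constant multiple of $\|f\|$, and that $\mathscr{L}_A^{n+1} f$ goes to $0$ when $n \to \infty$. We deduce that
$\sum_{k=0}^{\infty} \mathscr{L}_A^k$ is well-defined and a right-inverse to $I - \mathscr{L}_A$, which is therefore onto $\ker \mu_A$.
By commutation the above shows that, for all $f \in \ker \mu_A$,

\[ \sum_{k=0}^{n} \mathscr{L}_A^k (I - \mathscr{L}_A) f \]

converges to $f$, so that we have defined an inverse to (a corestriction of) $I - \mathscr{L}_A$. □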
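
As a sanity check of Lemma 3.1, the series can be computed by hand in the simplest case. The following sketch assumes a setting that is not taken from the text above: $T$ is the doubling map $x \mapsto 2x \bmod 1$ on $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$ and $A \equiv -\log 2$, which is normalized with $\mu_A$ the Lebesgue measure.

```latex
% Assumed setting (illustration only): T(x) = 2x mod 1, A = -log 2.
\[
  \mathscr{L}_A f(x) = \tfrac12\Big( f\big(\tfrac{x}{2}\big) + f\big(\tfrac{x+1}{2}\big) \Big),
  \qquad \mathscr{L}_A \mathbf{1} = \mathbf{1}.
\]
% Take f(x) = cos(4 pi x), which has zero Lebesgue mean.
\[
  \mathscr{L}_A f(x) = \tfrac12\big(\cos(2\pi x) + \cos(2\pi x + 2\pi)\big) = \cos(2\pi x),
  \qquad
  \mathscr{L}_A^2 f(x) = \tfrac12\big(\cos(\pi x) + \cos(\pi x + \pi)\big) = 0 .
\]
% The series therefore stops after two terms:
\[
  (I - \mathscr{L}_A)^{-1} f = f + \mathscr{L}_A f = \cos(4\pi x) + \cos(2\pi x),
\]
% and one checks directly that
\[
  (I - \mathscr{L}_A)\big(\cos(4\pi x) + \cos(2\pi x)\big)
  = \cos(4\pi x) + \cos(2\pi x) - \cos(2\pi x) - 0 = f(x).
\]
```

In this example the convergence is trivial because $f$ is killed by $\mathscr{L}_A^2$; for a general $f \in \ker \mu_A$ the tail is controlled by the spectral gap instead.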

This has useful consequences, which will be better phrased by introducing another 
operator related to A. 

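Definition 3.2. Given any normalized potential $A$, let $\mathscr{M}_A$ be the continuous linear
operator on $\mathcal{X}(\Omega)$ defined by

\[ \mathscr{M}_A(f) = -(I - \mathscr{L}_A)^{-1} \circ \mathscr{L}_A(f_A) = -\sum_{k=1}^{\infty} \mathscr{L}_A^k(f_A), \]

where $f_A := f - \int f \,\mathrm{d}\mu_A$.

Observe that $\mathscr{L}_A$ maps $\ker \mu_A$ to itself, so that $\mathscr{M}_A$ is indeed well-defined and takes
its values in $\ker \mu_A$, and that $\mathscr{M}_A$ commutes with $\mathscr{L}_A$.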

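Proposition 3.3. Let $A$ be a normalized potential. Then:

1. given $f \in \mathcal{C}$, there is a unique decomposition $f = g - g \circ T + c$ with $g \in \ker \mu_A$ and
$c$ a constant, given by $c = \int f \,\mathrm{d}\mu_A$ and $g = \mathscr{M}_A(f)$,

2. the subspace $\mathcal{C}$ is closed in $\mathcal{X}(\Omega)$, so that $\mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$ inherits a Banach space
structure from $\|\cdot\|$,

3. $\ker \mathscr{L}_A$ and $\mathcal{C}$ are (topological) complements in $\mathcal{X}(\Omega)$,

4. $\mathscr{L}_A$ maps $\mathcal{C}$ onto $\mathcal{X}(\Omega)$.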


Proof. First observe that given any decomposition $f = g - g \circ T + c$ and any $T$-invariant
probability measure $\mu$, we have

\[ \int f \,\mathrm{d}\mu = \int g \,\mathrm{d}\mu - \int g \,\mathrm{d}(T_\# \mu) + c = c, \]

where $T_\# \mu$ is the usual pushforward of the measure $\mu$ with respect to $T$. Since $\mu_A$ is
invariant, it follows that $c$ must equal $\int f \,\mathrm{d}\mu_A$ and is uniquely defined.

Let us then check that any $f \in \mathcal{C} \cap \ker \mathscr{L}_A$ must vanish. First, it is easy to see that
$\ker \mathscr{L}_A \subset \ker \mu_A$:

\[ \int f \,\mathrm{d}\mu_A = \int f \,\mathrm{d}(\mathscr{L}_A^* \mu_A) = \int \mathscr{L}_A(f) \,\mathrm{d}\mu_A = 0. \]

It follows that we can write $f = g - g \circ T$, so that

\[ 0 = \mathscr{L}_A(g - g \circ T) = \mathscr{L}_A(g) - g \]

and $g$ is an eigenfunction of $\mathscr{L}_A$ for the eigenvalue $1$. Therefore $g$ is constant and $f = 0$.

To prove 1, we write $f = g_1 - g_1 \circ T + c$ for some $g_1$ and with $c = \int f \,\mathrm{d}\mu_A$. Setting
$g = g_1 - \int g_1 \,\mathrm{d}\mu_A$, we still have $f = g - g \circ T + c$ and

\[ \mathscr{L}_A(f - c) = \mathscr{L}_A(g - g \circ T) = \mathscr{L}_A(g) - g \]

since $\mathscr{L}_A$ is a left-inverse to the composition operator. Now, from $g \in \ker \mu_A$ it follows
$g = -(I - \mathscr{L}_A)^{-1} \mathscr{L}_A(f - c) = \mathscr{M}_A(f)$, as claimed.

To prove 2, consider a sequence of functions $f_n \in \mathcal{C}$ which converges to $f \in \mathcal{X}(\Omega)$.
Then using 1, we can write $f_n = g_n - g_n \circ T + c_n$ where $g_n, c_n$ are images of $f_n$ by
continuous operators. In particular $g_n$ and $c_n$ have limits $g \in \mathcal{X}(\Omega)$ and $c \in \mathbb{R}$, so that
$f = g - g \circ T + c \in \mathcal{C}$.

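To prove 3, since we already know that $\ker \mathscr{L}_A$ and $\mathcal{C}$ intersect trivially, we consider
any $f \in \mathcal{X}(\Omega)$ and let $c := \int f \,\mathrm{d}\mu_A$ and $g = \mathscr{M}_A(f)$. We have

\[ \mathscr{L}_A(g - g \circ T + c) = \mathscr{L}_A(g) - g + c\,\mathscr{L}_A(\mathbf{1}) = \mathscr{L}_A(f - c) + c = \mathscr{L}_A(f), \]

where the second equality follows from $(\mathscr{L}_A - I)\mathscr{M}_A = \mathscr{L}_A$ on $\ker \mu_A$. It follows that
$\ell := f - (g - g \circ T + c)$ is an element of $\ker \mathscr{L}_A$. The decomposition

\[ f = \ell + (g - g \circ T + c) \]

shows that $\mathcal{X}(\Omega) = \ker \mathscr{L}_A + \mathcal{C}$ and since both spaces are closed, $\ker \mathscr{L}_A$ and $\mathcal{C}$ are
complements.

To prove 4, let $f \in \mathcal{X}(\Omega)$ and set $c = \int f \,\mathrm{d}\mu_A$ and $g := (I - \mathscr{L}_A)^{-1}(c - f)$. Now
$g - g \circ T + c$ is an element of $\mathcal{C}$, and we have

\[
\begin{aligned}
\mathscr{L}_A(g - g \circ T + c) &= \mathscr{L}_A(g) - g + c \\
&= -(I - \mathscr{L}_A)(g) + c \\
&= f - c + c = f.
\end{aligned}
\]

□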
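
The decomposition $\mathcal{X}(\Omega) = \ker \mathscr{L}_A \oplus \mathcal{C}$ of item 3 can be made explicit on a toy case. The sketch below again assumes (as an illustration, not as part of the framework above) the doubling map $T(x) = 2x \bmod 1$ with $A \equiv -\log 2$, so that $\mu_A$ is the Lebesgue measure, and takes $f(x) = \cos(4\pi x)$; the function $g$ is computed from the series of Definition 3.2.

```latex
% Assumed setting (illustration only): T(x) = 2x mod 1, A = -log 2, f(x) = cos(4 pi x).
% Recall L_A f = cos(2 pi x) and L_A^2 f = 0 in this setting.
\[
  c = \int f \,\mathrm{d}\mu_A = 0, \qquad
  g = -\sum_{k \ge 1} \mathscr{L}_A^k f = -\cos(2\pi x),
\]
\[
  g - g \circ T + c = -\cos(2\pi x) + \cos(4\pi x) \in \mathcal{C},
  \qquad
  \ell = f - (g - g \circ T + c) = \cos(2\pi x) \in \ker \mathscr{L}_A,
\]
\[
  f = \underbrace{\cos(2\pi x)}_{\in\, \ker \mathscr{L}_A}
    + \underbrace{\cos(4\pi x) - \cos(2\pi x)}_{\in\, \mathcal{C}} .
\]
```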


We are now in a position to prove our first main result, that $\mathcal{N}$ is an analytic
submanifold of $\mathcal{X}(\Omega)$. This result might be known, but we did not find a clear statement in
the literature: related statements are often framed into a weaker definition of analyticity,
the identification of the tangent space seems new, and we obtain the result without
resorting to complex analysis as usually used to prove the regularity of the eigendata of
operators (see appendix V in [PP90], where the weak definition of analyticity should be
noted, and also section 3.3 in [BS01]). We shall in fact deduce from Theorem 3.4 that
the leading eigenvalue and positive eigenfunction of $\mathscr{L}_A$ both depend analytically on $A$.





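Theorem 3.4. The set $\mathcal{N}$ of normalized potentials is an analytic submanifold of $\mathcal{X}(\Omega)$,
and its tangent space at $A \in \mathcal{N}$ is $T_A \mathcal{N} = \ker \mathscr{L}_A$.

Proof. This is a direct consequence of the implicit function theorem.

Let $F : \mathcal{X}(\Omega) \to \mathcal{X}(\Omega)$ be the map defined by

\[ F(A)(x) = \mathscr{L}_A(\mathbf{1})(x) = \sum_{Ty = x} e^{A(y)}. \]

Then $F$ is analytic, as follows from the analyticity of the exponential:

\[ F(A + \zeta) = \sum_{k \ge 0} \frac{1}{k!} \mathscr{L}_A(\zeta^k), \]

where

\[ \frac{1}{k!} \mathscr{L}_A(\zeta_1 \cdots \zeta_k)(x) = \frac{1}{k!} \sum_{Ty = x} e^{A(y)} \zeta_1(y) \cdots \zeta_k(y) \]

defines a series of continuous, symmetric $k$-linear operators with infinite radius of
convergence (note that we use here the assumptions that $\mathcal{X}(\Omega)$ has multiplicative norm,
and that $x \mapsto \sum_{Ty = x} \zeta(y)$ is in $\mathcal{X}(\Omega)$ for all $\zeta \in \mathcal{X}(\Omega)$).

Now, given a potential $A$ and a vector $\zeta$, both in $\mathcal{X}(\Omega)$, we have

\[ DF_A(\zeta)(x) = \sum_{Ty = x} e^{A(y)} \zeta(y) = \mathscr{L}_A(\zeta)(x), \]

so that $DF_A = \mathscr{L}_A$; since we know from Proposition 3.3 that $\ker \mathscr{L}_A$ is complemented
and $\mathscr{L}_A$ is onto $\mathcal{X}(\Omega)$, we can apply the implicit function theorem (to $F - \mathbf{1}$, whose
zero set is $\mathcal{N}$). □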

We also get directly the analyticity of the normalization map as explained in the last
paragraph of Section 2.3.

Theorem 3.5. The normalization map $N : \mathcal{X}(\Omega) \to \mathcal{N}$ sending a potential to its
normalized version is analytic. Moreover, its derivative $DN_A$ at a point $A \in \mathcal{X}(\Omega)$ is
the linear projection on $T_{N(A)} \mathcal{N} = \ker \mathscr{L}_{N(A)}$ in the direction of $\mathcal{C}$.

Proof. See Figure 1 for a general picture of the various maps involved. Let $\Pi : \mathcal{X}(\Omega) \to \mathcal{Q}$
be the quotient map; it is a continuous linear map, and in particular it is analytic. Its
restriction $\Pi|_{\mathcal{N}}$ to the submanifold $\mathcal{N}$ is therefore an analytic map, and we have for all
$A \in \mathcal{N}$:

\[ D(\Pi|_{\mathcal{N}})_A = \Pi|_{T_A \mathcal{N}} = \Pi|_{\ker \mathscr{L}_A}. \]

Since $\ker \mathscr{L}_A$ and $\mathcal{C}$ are topological complements, this differential is invertible with
continuous inverse. The inverse function theorem then ensures that

\[ (\Pi|_{\mathcal{N}})^{-1} : \mathcal{Q} \to \mathcal{N} \]

is well-defined and analytic. We get the desired result by observing that

\[ N = (\Pi|_{\mathcal{N}})^{-1} \circ \Pi. \]

□
Figure 1: Potentials and Gibbs measures


Corollary 3.6. The maps $\Lambda : \mathcal{X}(\Omega) \to \mathbb{R}$ and $H : \mathcal{X}(\Omega) \to \mathcal{X}(\Omega)$ sending a potential to
its leading eigendata, i.e. defined by

\[ \Lambda(A) = \lambda_A \quad \text{and} \quad H(A) = h_A \]

(normalized by the condition $\log h_A \in \ker \mu_{A_0}$ for any fixed $A_0$) are analytic maps.
Moreover for all $A, \zeta \in \mathcal{X}(\Omega)$,

\[ D(\log \Lambda)_A(\zeta) = \int \zeta \,\mathrm{d}\mu_A. \]

Note that it will turn out that in our framework $\log \Lambda$ equals the pressure functional,
so that this result also gives the derivative of the latter.

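Proof. Let $A_0$ be any potential, which can be assumed without loss of generality to be
normalized. Since $A - N(A) = g - g \circ T + \log \lambda_A$ with $g = -\log h_A \in \ker \mu_{A_0}$,
Proposition 3.3 gives

\[ \Lambda(A) = \exp\Big( \int (A - N(A)) \,\mathrm{d}\mu_{A_0} \Big), \qquad H(A) = \exp\big( -\mathscr{M}_{A_0}(A - N(A)) \big), \]

which are analytic as composed of analytic maps.

Differentiating $\log \Lambda(A) = \int (A - N(A)) \,\mathrm{d}\mu_{A_0}$ with respect to $A$, we get

\[ D(\log \Lambda)_A(\zeta) = \int (\zeta - DN_A(\zeta)) \,\mathrm{d}\mu_{A_0}. \]

This holds for any $A_0$ and any $A$; in particular taking $A_0 = A$ and observing that
$DN_A(\zeta) \in \ker \mu_A$ yields the desired formula. □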
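
The derivative formula of Corollary 3.6 can be checked by hand in the two directions spanning $\mathcal{C}$, where the leading eigenvalue is explicitly computable. The following is only a consistency check under the standing hypotheses (existence of a positive eigenfunction $h_A$ for the leading eigenvalue $\lambda_A$), not an alternative proof.

```latex
% Direction zeta = c (a constant): L_{A+tc} = e^{tc} L_A, hence
\[
  \lambda_{A+tc} = e^{tc}\lambda_A, \qquad
  \frac{\mathrm{d}}{\mathrm{d}t}\log\lambda_{A+tc}\Big|_{t=0} = c = \int c \,\mathrm{d}\mu_A .
\]
% Direction zeta = g - g∘T (a coboundary): for every f and x,
\[
  \mathscr{L}_{A+t(g - g\circ T)} f(x)
  = \sum_{Ty=x} e^{A(y) + t g(y) - t g(x)} f(y)
  = e^{-t g(x)}\, \mathscr{L}_A\big(e^{t g} f\big)(x),
\]
% so e^{-tg} h_A is a positive eigenfunction with the same eigenvalue lambda_A, and
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\log\lambda_{A+t(g-g\circ T)}\Big|_{t=0} = 0
  = \int (g - g\circ T)\,\mathrm{d}\mu_A ,
\]
% the last equality being the T-invariance of mu_A.
```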

Corollary 3.7. The map $G : A \mapsto \mu_A \in \mathcal{X}(\Omega)^*$ is analytic. In particular for each
$\varphi \in \mathcal{X}(\Omega)$, the map $G_\varphi : A \mapsto \int \varphi \,\mathrm{d}\mu_A$ is analytic.

Proof. Corollary 3.6 implies that $\mu_A = D(\log \Lambda)_A$ as a linear form defined on $\mathcal{X}(\Omega)$, so
that the Corollary follows from the analyticity of $\log \Lambda$. □

At this point we have proved Corollary B from the introduction.

Corollaries 3.6 and 3.7 were obtained under different assumptions and with different
methods by Bomfim, Castro and Varandas [BCV12]; note that we notably do not
assume the high-temperature regime (see their conditions (P) and (P')) and that once
our framework is set, our proofs are very simple.


4 Differentiating the Gibbs map in the affine 
structure 


There are at least two ways to endow the set of probability measures $\mathcal{P}(\Omega)$ with a kind
of differential structure, i.e. to define what it means for a map such as the Gibbs map
$G : A \mapsto \mu_A$ to be differentiable. In this section, we consider the affine structure, while
in Section 6 we will consider the Wasserstein structure.

The affine structure is obtained simply by observing that $\mathcal{P}(\Omega)$ is a convex set in
$\mathcal{X}(\Omega)^*$; "coordinates" are obtained by looking at integrals of test functions, so that $G$ is
often considered to be differentiable if

\[ \forall A, \zeta, \varphi \in \mathcal{X}(\Omega): \quad \frac{\mathrm{d}}{\mathrm{d}t} \int \varphi \,\mathrm{d}\mu_{A + t\zeta} \Big|_{t=0} \]

exists.

We will adopt here the definition of Fréchet differentiability for $G : \mathcal{X}(\Omega) \to \mathcal{X}(\Omega)^*$.
It is stronger than the above one in three respects: we ask that for each $\varphi$ the directional
derivatives at $A$ can be collected as a continuous linear map $\mathcal{X}(\Omega) \to \mathbb{R}$, that all these
linear maps for various $\varphi$ can be collected as a continuous linear map $\mathcal{X}(\Omega) \to \mathcal{X}(\Omega)^*$,
and that in the Taylor formula defining the derivative, the remainder is of the form
$o(\|\varphi\| \|\zeta\|)$ (when $\zeta \to 0$). Note that at this point this strong definition is already
ensured by the analyticity of $G$ and we only want to get an explicit formula.


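Theorem 4.1. For all $A \in \mathcal{X}(\Omega)$ there is a neighborhood $U$ of $0$ in $\mathcal{X}(\Omega)$ such that for
all $\varphi \in \mathcal{X}(\Omega)$ and all $\zeta \in U$, we have

\[ \int \varphi \,\mathrm{d}\mu_{A+\zeta} - \int \varphi \,\mathrm{d}\mu_A = \int (I - \mathscr{L}_{N(A)})^{-1}(\varphi_A) \cdot DN_A(\zeta) \,\mathrm{d}\mu_A + O(\|\varphi\| \|\zeta\|^2), \]

where $\varphi_A := \varphi - \int \varphi \,\mathrm{d}\mu_A$ is the projection of $\varphi$ on $\ker \mu_A$ along the space of constants.

Implicitly, the constant in the $O$ depends only on $A$ (and of course $U$, $\Omega$, $T$, $\mathcal{X}(\Omega)$) but
not on $\varphi$ and $\zeta$. This result will be deduced from the following special case where the
expression is simpler.

Theorem 4.2. Assume that $A$ is normalized, $\varphi$ has mean $0$ with respect to $\mu_A$, and $\zeta$
is tangent to $\mathcal{N}$ at $A$ and small enough. Then:

\[ \int \varphi \,\mathrm{d}\mu_{A+\zeta} = \int (I - \mathscr{L}_A)^{-1}(\varphi) \cdot \zeta \,\mathrm{d}\mu_A + O(\|\varphi\| \|\zeta\|^2). \]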
Writing $G_\varphi$ for the composition of the evaluation at $\varphi$ and the Gibbs map, i.e. $G_\varphi(A) =
\int \varphi \,\mathrm{d}\mu_A$, the above formula can be recast as:

\[ D(G_\varphi)_A(\zeta) = \int (I - \mathscr{L}_A)^{-1}(\varphi) \cdot \zeta \,\mathrm{d}\mu_A \quad \text{when } A \in \mathcal{N},\ \zeta \in \ker \mathscr{L}_A,\ \varphi \in \ker \mu_A. \]

Observe that using the series expression of $(I - \mathscr{L}_A)^{-1}$, that $\mu_A$ is fixed by $\mathscr{L}_A^*$, and
that the transfer operator is a left-inverse to the composition operator, this also rewrites
as

\[ D(G_\varphi)_A(\zeta) = \sum_{i=0}^{+\infty} \int \varphi \cdot \zeta \circ T^i \,\mathrm{d}\mu_A. \]

This version has the advantage that it applies to test functions $\varphi$ not necessarily in
$\ker \mu_A$, because $\zeta \in \ker \mathscr{L}_A$ implies $\zeta \in \ker \mu_A$ and adding a constant to $\varphi$ does therefore
not change the value of the integrals.

We can rephrase Theorem 4.1 in a similar way, which will be used in the sequel to
define a metric on $\mathcal{X}(\Omega)$.

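Corollary 4.3. For all $A, \zeta, \varphi \in \mathcal{X}(\Omega)$, if $\zeta$ is small enough we have

\[
\int \varphi \,\mathrm{d}\mu_{A+\zeta} - \int \varphi \,\mathrm{d}\mu_A
= \int \varphi_A \zeta \,\mathrm{d}\mu_A + \sum_{i=1}^{\infty} \int \big( \varphi_A \cdot \zeta \circ T^i + \varphi_A \circ T^i \cdot \zeta \big) \,\mathrm{d}\mu_A + O(\|\varphi\| \|\zeta\|^2),
\]

where the above sum converges and defines a continuous bilinear form.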
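
To illustrate the expansion of Corollary 4.3, here is a hand computation in a toy case. The setting is an assumption made for the example only: $T$ is the doubling map $x \mapsto 2x \bmod 1$, $A \equiv -\log 2$ (so that $\mu_A$ is the Lebesgue measure on $\mathbb{S}^1$), and $\varphi(x) = \zeta(x) = \cos(2\pi x)$.

```latex
% Assumed setting (illustration only): T(x) = 2x mod 1, A = -log 2, mu_A = Lebesgue,
% phi(x) = zeta(x) = cos(2 pi x); phi has zero mean, so phi_A = phi.
\[
  \int \varphi_A\,\zeta \,\mathrm{d}\mu_A = \int_0^1 \cos^2(2\pi x)\,\mathrm{d}x = \tfrac12,
\]
\[
  \int \varphi_A\cdot \zeta\circ T^i \,\mathrm{d}\mu_A
  = \int_0^1 \cos(2\pi x)\cos(2^{i+1}\pi x)\,\mathrm{d}x = 0
  \qquad (i \ge 1),
\]
% and the same holds with the roles of phi_A and zeta exchanged.
% Applying the expansion to the direction t*zeta gives
\[
  \int \cos(2\pi x)\,\mathrm{d}\mu_{A + t\cos(2\pi\,\cdot)}(x) = \tfrac{t}{2} + O(t^2).
\]
```

Here all correlation terms vanish by orthogonality of distinct Fourier modes; in general they decay exponentially thanks to the spectral gap.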


4.1 The case of a pair of normalized potentials 

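To obtain Theorem 4.2, thanks to the regularity of the normalization map proved in the
previous section, we are mostly reduced to estimating $\int \varphi \,\mathrm{d}(\mu_B - \mu_A)$ when $\varphi \in \mathcal{X}(\Omega)$ is
fixed and $A$, $B$ are normalized potentials. Up to adding a constant to $\varphi$, which does not
change the value of the above integral, we assume that $\varphi \in \ker \mu_A$.
We first write (using that $\mu_A$ and $\mu_B$ are respectively fixed by $\mathscr{L}_A^*$ and $\mathscr{L}_B^*$)

\[
\begin{aligned}
\int \varphi \,\mathrm{d}(\mu_B - \mu_A) &= \int \mathscr{L}_B(\varphi) \,\mathrm{d}\mu_B - \int \mathscr{L}_A(\varphi) \,\mathrm{d}\mu_A \\
&= \int \big( \mathscr{L}_B(\varphi) - \mathscr{L}_A(\varphi) \big) \,\mathrm{d}\mu_B + \int \mathscr{L}_A(\varphi) \,\mathrm{d}(\mu_B - \mu_A).
\end{aligned}
\tag{2}
\]

Then, we observe

\[
\begin{aligned}
\big( \mathscr{L}_B(\varphi) - \mathscr{L}_A(\varphi) \big)(x) &= \sum_{T(y) = x} e^{A(y)} \varphi(y) \big( e^{B(y) - A(y)} - 1 \big) \\
&= \mathscr{L}_A\big( \varphi (e^{B - A} - 1) \big)(x),
\end{aligned}
\]

so that writing $R(x) = e^x - 1 - x = \sum_{k \ge 2} x^k / k!$ we get

\[ \mathscr{L}_B(\varphi) - \mathscr{L}_A(\varphi) = \mathscr{L}_A(\varphi \cdot (B - A)) + \mathscr{L}_A(\varphi \cdot R(B - A)). \]

Thus:

\[
\begin{aligned}
\int \varphi \,\mathrm{d}(\mu_B - \mu_A) &= \int \mathscr{L}_A(\varphi \cdot (B - A)) \,\mathrm{d}\mu_B + \int \mathscr{L}_A(\varphi \cdot R(B - A)) \,\mathrm{d}\mu_B \\
&\quad + \int \mathscr{L}_A(\varphi) \,\mathrm{d}(\mu_B - \mu_A) \\
&= \int \mathscr{L}_A(\varphi \cdot (B - A)) \,\mathrm{d}\mu_A + \int \mathscr{L}_A(\varphi \cdot (B - A)) \,\mathrm{d}(\mu_B - \mu_A) \\
&\quad + \int \mathscr{L}_A(\varphi \cdot R(B - A)) \,\mathrm{d}\mu_B + \int \mathscr{L}_A(\varphi) \,\mathrm{d}(\mu_B - \mu_A),
\end{aligned}
\]

that is,

\[
\int \varphi \,\mathrm{d}(\mu_B - \mu_A) = \int \varphi \cdot (B - A) \,\mathrm{d}\mu_A + \int \mathscr{L}_A(\varphi) \,\mathrm{d}(\mu_B - \mu_A) + \mathscr{O}(\varphi, B),
\tag{3}
\]

where $\mathscr{O}(\varphi, B) = \int \mathscr{L}_A(\varphi \cdot (B - A)) \,\mathrm{d}(\mu_B - \mu_A) + \int \mathscr{L}_A(\varphi \cdot R(B - A)) \,\mathrm{d}\mu_B$, which is
linear in $\varphi$ and which we now aim at bounding by a multiple of $\|\varphi\| \|B - A\|^2$.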

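A first tool is the regularity of $G$.

Lemma 4.4. The map $G : \mathcal{X}(\Omega) \to \mathcal{X}(\Omega)^*$ is locally Lipschitz: for all $A \in \mathcal{X}(\Omega)$ there
exist a neighborhood $U \subset \mathcal{X}(\Omega)$ of $A$ and a constant $C$ such that, for all $\varphi \in \mathcal{X}(\Omega)$ and
all $B \in U$, it holds:

\[ \Big| \int \varphi \,\mathrm{d}(\mu_B - \mu_A) \Big| \le C \|\varphi\| \|B - A\|. \]

Proof. This follows from the analyticity of $G$ obtained in Corollary 3.7. □

A second observation is that since $\mathcal{X}(\Omega)$ has a multiplicative norm, we get

\[
\begin{aligned}
\|R(B - A)\| &= \Big\| \sum_{k \ge 2} \frac{(B - A)^k}{k!} \Big\| \\
&\le \sum_{k \ge 2} \frac{\|B - A\|^k}{k!} \\
&= R(\|B - A\|) \\
&\le C' \|B - A\|^2
\end{aligned}
\]

when $B$ is in any fixed neighborhood $U$ of $A$.

Now, since $\|\cdot\|$ is assumed to control the sup norm and $\mu_B$ is a probability measure,
whenever $B \in U$ we get

\[
\begin{aligned}
|\mathscr{O}(\varphi, B)| &\le C |\mathscr{L}_A| \|\varphi\| \|B - A\|^2 + C'' |\mathscr{L}_A| \|\varphi\| \|R(B - A)\| \\
&\le C''' \|\varphi\| \|B - A\|^2.
\end{aligned}
\]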
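Now, applying (3) to its own second term repeatedly and recalling that $\mathscr{L}_A^n(\varphi)$ goes
to zero thanks to the spectral gap assumption, we get

\[
\begin{aligned}
\int \varphi \,\mathrm{d}(\mu_B - \mu_A) &= \int \varphi \cdot (B - A) \,\mathrm{d}\mu_A + \int \mathscr{L}_A(\varphi) \,\mathrm{d}(\mu_B - \mu_A) + \mathscr{O}(\varphi, B) \\
&= \int (\varphi + \mathscr{L}_A(\varphi)) \cdot (B - A) \,\mathrm{d}\mu_A + \int \mathscr{L}_A^2(\varphi) \,\mathrm{d}(\mu_B - \mu_A) \\
&\quad + \mathscr{O}(\varphi + \mathscr{L}_A(\varphi), B) \\
&= \int \Big( \sum_{n \ge 0} \mathscr{L}_A^n(\varphi) \Big) \cdot (B - A) \,\mathrm{d}\mu_A + \mathscr{O}\Big( \sum_{n \ge 0} \mathscr{L}_A^n(\varphi), B \Big) \\
&= \int (I - \mathscr{L}_A)^{-1}(\varphi) \cdot (B - A) \,\mathrm{d}\mu_A + O(\|\varphi\| \|B - A\|^2),
\end{aligned}
\tag{4}
\]

which is almost Theorem 4.2, except for the assumption that $B$ is normalized.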


4.2 End of proofs 

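Proof of Theorem 4.2. Since $N$ is an analytic projection to $\mathcal{N}$ (i.e. $N$ restricted to $\mathcal{N}$ is
the identity), we have

\[ N(A + \zeta) = A + \zeta + O(\|\zeta\|^2) \]

for all $A \in \mathcal{N}$ and all small enough $\zeta \in T_A \mathcal{N} = \ker \mathscr{L}_A$, with an implicit constant only
depending on $A$.

Fix $A \in \mathcal{N}$, $\zeta \in \ker \mathscr{L}_A$ and $\varphi \in \ker \mu_A$, and set $B = N(A + \zeta)$. Using (4) with the
normalized potentials $A$ and $B$, we get

\[
\begin{aligned}
\int \varphi \,\mathrm{d}(\mu_{A+\zeta} - \mu_A) &= \int \varphi \,\mathrm{d}(\mu_{A+\zeta} - \mu_B) + \int \varphi \,\mathrm{d}(\mu_B - \mu_A) \\
&= O(\|\varphi\| \|A + \zeta - N(A + \zeta)\|) \\
&\quad + \int (I - \mathscr{L}_A)^{-1} \varphi \cdot (B - A) \,\mathrm{d}\mu_A + O(\|\varphi\| \|B - A\|^2) \\
&= \int (I - \mathscr{L}_A)^{-1} \varphi \cdot \zeta \,\mathrm{d}\mu_A + O(\|\varphi\| \|\zeta\|^2),
\end{aligned}
\]

for $\zeta$ small enough, and with an implicit constant that depends only on $A$. □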

Proof of Theorem 4.1. Let $A, \zeta, \varphi \in \mathcal{X}(\Omega)$ be arbitrary. Then we consider:

• $N(A)$, which is the normalized potential such that $\mu_{N(A)} = \mu_A$,

• $DN_A(\zeta)$, which is the projection of $\zeta$ on $\ker \mathscr{L}_{N(A)}$ in the direction of $\mathcal{C}$,

• $\varphi_A = \varphi - \int \varphi \,\mathrm{d}\mu_A \in \ker \mu_A$,

and we apply Theorem 4.2 to this new potential, tangent vector, and test function. We
obtain exactly the desired expression once we notice that

\[
\begin{aligned}
|\mu_{A+\zeta} - \mu_{N(A) + DN_A(\zeta)}| &= |\mu_{N(A+\zeta)} - \mu_{N(A) + DN_A(\zeta)}| \\
&= O(\|N(A + \zeta) - N(A) - DN_A(\zeta)\|) \\
&= O(\|\zeta\|^2).
\end{aligned}
\]

□


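Proof of Corollary 4.3. We have to rewrite

\[ \int (I - \mathscr{L}_{N(A)})^{-1} \varphi_A \cdot DN_A(\zeta) \,\mathrm{d}\mu_A. \]

We first observe that the final expression we aim for only involves $A$ through the measure
$\mu_A$, so that we can as well replace $A$ by $N(A)$, i.e. assume that $A$ is normalized (the
sole purpose of this is to avoid writing $\mathscr{L}_{N(A)}$ a dozen times).

We first write $(I - \mathscr{L}_A)^{-1} \varphi_A = \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A$, and recall that $DN_A(\zeta)$ is the projection
of $\zeta$ to $\ker \mathscr{L}_A$ along $\mathcal{C}$; this means that there is a function $g \in \mathcal{X}(\Omega)$ such that

\[ DN_A(\zeta) = \zeta_A + g - g \circ T \tag{5} \]

(where $\zeta_A = \zeta - \int \zeta \,\mathrm{d}\mu_A \in \ker \mu_A$) and that $\mathscr{L}_A(DN_A(\zeta)) = 0$. In particular, we have
$\mathscr{M}_A(DN_A(\zeta)) = 0$; thus,

\[ g = \mathscr{M}_A(DN_A(\zeta) - \zeta_A) = -\mathscr{M}_A(\zeta) = \sum_{i \ge 1} \mathscr{L}_A^i \zeta_A. \]

This leads us to

\[
\begin{aligned}
\int (I - \mathscr{L}_A)^{-1} \varphi_A \cdot DN_A(\zeta) \,\mathrm{d}\mu_A
&= \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot \zeta_A \,\mathrm{d}\mu_A + \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot g \,\mathrm{d}\mu_A - \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot g \circ T \,\mathrm{d}\mu_A \\
&= \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot \zeta_A \,\mathrm{d}\mu_A + \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot g \,\mathrm{d}\mu_A - \int \sum_{i \ge 1} \mathscr{L}_A^i \varphi_A \cdot g \,\mathrm{d}\mu_A \\
&= \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot \zeta_A \,\mathrm{d}\mu_A + \int \varphi_A \cdot g \,\mathrm{d}\mu_A \\
&= \int \sum_{i \ge 0} \mathscr{L}_A^i \varphi_A \cdot \zeta_A \,\mathrm{d}\mu_A + \int \varphi_A \cdot \sum_{i \ge 1} \mathscr{L}_A^i \zeta_A \,\mathrm{d}\mu_A,
\end{aligned}
\]

and, using the invariance of $\mu_A$ under $\mathscr{L}_A^*$,

\[
\int (I - \mathscr{L}_A)^{-1} \varphi_A \cdot DN_A(\zeta) \,\mathrm{d}\mu_A
= \int \varphi_A \zeta_A \,\mathrm{d}\mu_A + \sum_{i \ge 1} \int \big( \varphi_A \cdot \zeta_A \circ T^i + \varphi_A \circ T^i \cdot \zeta_A \big) \,\mathrm{d}\mu_A,
\]

where the sum converges (exponentially).

Finally, we observe that there is no use normalizing both $\varphi$ and $\zeta$, since for example
$\int \varphi_A \zeta_A \,\mathrm{d}\mu_A = \int \varphi_A \zeta \,\mathrm{d}\mu_A$. All $\zeta_A$ can therefore be replaced by $\zeta$ and we get the desired
formula. □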


5 A Riemannian metric on the space of normalized 
potentials 

The goal of this section is to define and to study a (weak) Riemannian metric on the space
of Gibbs measures. More precisely, we construct a Riemannian metric on the manifold
of normalized potentials, which corresponds equivalently to a Riemannian metric on the
quotient space $\mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$, and relates in various ways to dynamical quantities. After
a conformal rescaling by the metric entropy, this metric is very closely related to the
metric defined by McMullen [McM08] (see also [BCS15] and references therein).

5.1 Weak and strong inner products on Banach spaces 

Consider a positive symmetric bilinear form $\langle \cdot, \cdot \rangle$ on some Banach space $\mathcal{Y}$. There are
two possible definitions of positive-definiteness. The first one is a copy and paste of the
finite-dimensional definition, that is, we ask that

\[ \forall y \in \mathcal{Y} \setminus \{0\}: \quad \langle y, y \rangle > 0. \]

In this case, one says that $\langle \cdot, \cdot \rangle$ is weakly positive-definite. The second one is to ask that
the Banach norm $\|\cdot\|$ of $\mathcal{Y}$ controls $\langle \cdot, \cdot \rangle$ from below, that is,

\[ \exists C > 0,\ \forall y \in \mathcal{Y}: \quad \langle y, y \rangle \ge C \|y\|^2. \]

In this case, one says that $\langle \cdot, \cdot \rangle$ is strongly positive-definite; note that this condition
implies weak positive-definiteness.

Most of the time, one is only interested in bilinear forms which are continuous with
respect to the Banach topology of $\mathcal{Y}$. But if $\langle \cdot, \cdot \rangle$ is both continuous and strongly
positive-definite, then its associated norm is equivalent to $\|\cdot\|$, and in particular $\mathcal{Y}$ must
be isomorphic to a Hilbert space. Therefore, most Banach spaces have no continuous,
strongly positive-definite inner product.
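
The gap between the two notions is already visible on a classical example, sketched below. The space $C([0,1])$ with the sup norm is chosen here only for illustration and is not one of the spaces $\mathcal{X}(\Omega)$ considered in this article.

```latex
% Assumed example: Y = C([0,1]) with the sup norm, <f,g> = \int_0^1 f g dx.
\[
  |\langle f, g\rangle| \le \|f\|_\infty \|g\|_\infty
  \quad\text{(continuity)}, \qquad
  \langle f, f\rangle = \int_0^1 f^2\,\mathrm{d}x > 0 \ \text{ for continuous } f \ne 0
  \quad\text{(weak positive-definiteness)}.
\]
% No constant C > 0 can satisfy <f,f> >= C ||f||^2: take f_n(x) = x^n.
\[
  \|f_n\|_\infty = 1, \qquad
  \langle f_n, f_n\rangle = \int_0^1 x^{2n}\,\mathrm{d}x = \frac{1}{2n+1}
  \xrightarrow[n\to\infty]{} 0 .
\]
```

So this form is a continuous, weakly positive-definite inner product which is not strongly positive-definite, in line with the fact that $C([0,1])$ is not isomorphic to a Hilbert space.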

We shall say that $\langle \cdot, \cdot \rangle$ is an inner product if it is continuous and weakly positive-
definite, and use the term semi-definite inner product for a merely continuous, positive
semi-definite symmetric bilinear form. By a Riemannian metric on a smooth Banach
manifold, we mean a field of inner products on the tangent spaces, such that when
translated in a chart, the inner product depends smoothly on the point, that is, it
defines a smooth map from the domain of the chart to the Banach space of symmetric
bilinear forms.

As a last remark, note that when $\langle \cdot, \cdot \rangle$ is an inner product inducing a complete norm,
it endows $\mathcal{Y}$ with a second structure of Banach space (more precisely a Hilbert structure
of course). Then the identity map $\mathcal{Y} \to \mathcal{Y}$ is a continuous bijection between the two
Banach structures at hand, and is therefore an isomorphism. This implies in particular
that $\langle \cdot, \cdot \rangle$ is strongly positive-definite. In other words, inner products which are not
strongly positive-definite induce a norm which is never complete. This means that there
will be a relatively subtle interplay between the topology of $\mathcal{Y}$ and the measurements
made from $\langle \cdot, \cdot \rangle$.


5.2 The Variance metric 

Now, we introduce our proposed metric. Its main properties are summed up in the 
following result. 

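Theorem 5.1. There exists an analytic map from $\mathcal{X}(\Omega)$ to the space of its continuous
symmetric bilinear forms, which maps any potential $A$ to a semi-definite inner product
$\langle \cdot, \cdot \rangle_A$ such that:

1. $\langle \cdot, \cdot \rangle_A$ restricts to $T_A \mathcal{N}$ into an inner product for all $A \in \mathcal{N}$, thus inducing a
Riemannian metric on $\mathcal{N}$,

2. this Riemannian metric coincides with the one obtained from $L^2(\mu_A)$:

\[ \forall A \in \mathcal{N},\ \forall \eta, \zeta \in T_A \mathcal{N}: \quad \langle \eta, \zeta \rangle_A = \int \eta \zeta \,\mathrm{d}\mu_A, \]

3. for all $A$, $\langle \cdot, \cdot \rangle_A$ induces a well defined inner product on $\mathcal{Q}$, thus inducing a
Riemannian metric on this quotient space,

4. for all $A, \zeta, \varphi \in \mathcal{X}(\Omega)$, it holds:

\[ \frac{\mathrm{d}}{\mathrm{d}t} \int \varphi \,\mathrm{d}\mu_{A + t\zeta} \Big|_{t=0} = \langle \varphi, \zeta \rangle_A, \]

5. for all $A, \zeta \in \mathcal{X}(\Omega)$, it holds:

\[ \operatorname{Var}(\zeta_A, \mu_A) := \lim_{n \to \infty} \frac{1}{n} \int \Big( \sum_{i=0}^{n-1} \zeta_A \circ T^i \Big)^2 \,\mathrm{d}\mu_A = \langle \zeta, \zeta \rangle_A. \]

Of course, the metrics in $\mathcal{N}$ and $\mathcal{X}(\Omega)/\mathcal{C}$ correspond one to the other through the
natural identification between these two spaces. There is really only one Riemannian
metric, which can be viewed in two ways. Any of the last two items completely specifies
$\langle \cdot, \cdot \rangle_A$, and can be taken as a definition. Our point here is that these expressions define
the same bilinear form, inducing an inner product on $T_A \mathcal{N}$.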

The end of this section is devoted to the proof of Theorem 5.1. The first step leading
to this result is to observe that the expression in Corollary 4.3 is symmetric: in the
right-hand side, $\zeta$ and $\varphi$ play the same role (up to normalization, but the formula holds
and was even proved with $\zeta$ also normalized to $\zeta_A$). This means that the function against
which $\mu_A$ is tested and the direction in which $A$ is moved play precisely the same role in
the evolution of the integral, a somewhat surprising connection (but which also follows
from Corollary 3.6 and the Schwarz Lemma). It also indicates that the right-hand side
in Corollary 4.3 defines a symmetric bilinear form; given $A \in \mathcal{X}(\Omega)$, we define for all
$\eta, \zeta \in \mathcal{X}(\Omega)$:

\[
\begin{aligned}
\langle \eta, \zeta \rangle_A &= \int \eta_A \cdot \zeta_A \,\mathrm{d}\mu_A + \sum_{i \ge 1} \int \big( \eta_A \cdot \zeta_A \circ T^i + \eta_A \circ T^i \cdot \zeta_A \big) \,\mathrm{d}\mu_A \\
&= \int \eta_A \cdot \zeta_A \,\mathrm{d}\mu_A + \sum_{i \ge 1} \int \big( \mathscr{L}_{N(A)}^i(\eta_A) \cdot \zeta_A + \eta_A \cdot \mathscr{L}_{N(A)}^i(\zeta_A) \big) \,\mathrm{d}\mu_A.
\end{aligned}
\]

The second expression shows that $\langle \eta, \zeta \rangle_A$ is a well defined number, and that it defines a
continuous symmetric bilinear form on $\mathcal{X}(\Omega)$. It also follows from Section 4 that $\langle \cdot, \cdot \rangle_A$
depends continuously on $A$; in fact the analyticity of $\langle \cdot, \cdot \rangle_A$ follows from Corollary 3.6
in the same way as Corollary 3.7: we have $\langle \zeta, \eta \rangle_A = D^2(\log \Lambda)_A(\zeta, \eta)$, which depends
analytically on $A$.

From Corollary 4.3, we see that

\[ \langle \eta, \zeta \rangle_A = \int (I - \mathscr{L}_{N(A)})^{-1}(\eta_A) \cdot DN_A(\zeta) \,\mathrm{d}\mu_A, \tag{6} \]

which seems asymmetric but will be useful.

As it is, $\langle \cdot, \cdot \rangle_A$ does not define an inner product, because it is not weakly positive-
definite.

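Proposition 5.2. The symmetric form $\langle \cdot, \cdot \rangle_A$ is positive semi-definite, and for all $A \in
\mathcal{N}$ and $\zeta \in T_A \mathcal{N}$ we have $\langle \zeta, \zeta \rangle_A = \int \zeta^2 \,\mathrm{d}\mu_A$. Moreover given $A, \zeta \in \mathcal{X}(\Omega)$, the following
three statements are equivalent:

1. $\langle \zeta, \eta \rangle_A = 0$ for all $\eta \in \mathcal{X}(\Omega)$,

2. $\langle \zeta, \zeta \rangle_A = 0$,

3. $\zeta \in \mathcal{C}$.

Proof. First, observe that when $A \in \mathcal{N}$ and $\zeta \in T_A \mathcal{N}$ we have $\zeta_A = \zeta$ and $\mathscr{L}_A(\zeta) = 0$,
so that:

\[ \langle \zeta, \zeta \rangle_A = \int \zeta_A \cdot \zeta_A \,\mathrm{d}\mu_A + \sum_{i \ge 1} \int \big( \mathscr{L}_A^i(\zeta_A) \cdot \zeta_A + \zeta_A \cdot \mathscr{L}_A^i(\zeta_A) \big) \,\mathrm{d}\mu_A = \int \zeta \cdot \zeta \,\mathrm{d}\mu_A. \]

It is clear that 1 implies 2, and (6) shows that 3 implies 1. Let us show that 2 implies
3. Since $\langle \cdot, \cdot \rangle_A$ does not change if we add a constant or a coboundary to $A$ (i.e. it only
depends on $\mu_A$), we can assume that $A$ is normalized.

Suppose that $\zeta$ is isotropic, i.e. $\langle \zeta, \zeta \rangle_A = 0$. By Proposition 3.3, we decompose
$\zeta = \zeta' + f$, where $\zeta' \in \ker \mathscr{L}_A$ and $f \in \mathcal{C}$. Then, $\langle \zeta', \zeta' \rangle_A = \langle \zeta, \zeta \rangle_A$, since $\mathcal{C}$ is in the
kernel of $\langle \cdot, \cdot \rangle_A$. Thus, we get $0 = \langle \zeta', \zeta' \rangle_A = \int \zeta'^2 \,\mathrm{d}\mu_A$. It follows that $\zeta' = 0$ and
$\zeta = f \in \mathcal{C}$. □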





The last line of this proof is where we use (H5). 

Definition 5.3. Let $A \in \mathcal{X}(\Omega)$ be any potential and $[A] \in \mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$ be its class
modulo $\mathcal{C}$. For all $[\eta], [\zeta] \in \mathcal{Q}$, we define

\[ \langle [\eta], [\zeta] \rangle_{[A]} = \langle \eta, \zeta \rangle_A, \]

which is well-defined by Proposition 5.2, i.e. does not depend on the chosen
representatives in each class. If $A \in \mathcal{N}$, we still write $\langle \cdot, \cdot \rangle_A$ for the restriction of this inner product
to $T_A \mathcal{N}$. Proposition 5.2 shows that both these products are weakly positive-definite,
and thus induce a norm on the Banach space they are defined on ($\mathcal{Q}$ and $T_A \mathcal{N} = \ker \mathscr{L}_A$,
respectively). We denote both norms by $\|\cdot\|_A$, i.e.

\[ \|\zeta\|_A = \sqrt{\langle \zeta, \zeta \rangle_A}, \]

and we use this notation for general $\zeta \in \mathcal{X}(\Omega)$.

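Let us now prove the last statement of Theorem 5.1. As usual, we can define the
variance of a function in $\ker \mu_A$ by

\[ \operatorname{Var}(\zeta_A, \mu_A) := \lim_{n \to \infty} \frac{1}{n} \int \Big( \sum_{i=0}^{n-1} \zeta_A \circ T^i \Big)^2 \,\mathrm{d}\mu_A. \]

By direct computation, we obtain

\[
\begin{aligned}
\int \Big( \sum_{i=0}^{n-1} \zeta_A \circ T^i \Big)^2 \,\mathrm{d}\mu_A &= \sum_{i,j=0}^{n-1} \int \zeta_A \circ T^i \cdot \zeta_A \circ T^j \,\mathrm{d}\mu_A \\
&= n \int \zeta_A^2 \,\mathrm{d}\mu_A + 2 \sum_{0 \le i < j \le n-1} \int \zeta_A \cdot \zeta_A \circ T^{j-i} \,\mathrm{d}\mu_A \\
&= n \int \zeta_A^2 \,\mathrm{d}\mu_A + 2 \sum_{k=1}^{n-1} (n - k) \int \zeta_A \cdot \zeta_A \circ T^k \,\mathrm{d}\mu_A.
\end{aligned}
\]

Assume without loss of generality that $A$ is normalized. Then,

\[
\begin{aligned}
\frac{1}{n} \int \Big( \sum_{i=0}^{n-1} \zeta_A \circ T^i \Big)^2 \,\mathrm{d}\mu_A &= \int \zeta_A^2 \,\mathrm{d}\mu_A + 2 \sum_{k=1}^{n-1} \int \zeta_A \cdot \zeta_A \circ T^k \,\mathrm{d}\mu_A \\
&\quad - \frac{2}{n} \sum_{k=1}^{n-1} k \int \mathscr{L}_A^k(\zeta_A) \cdot \zeta_A \,\mathrm{d}\mu_A,
\end{aligned}
\]

where the last term is bounded in norm by a constant multiple of $\frac{1}{n} \|\zeta_A\|^2$ (the constant
depending on the spectral gap $\delta$ of $\mathscr{L}_A$), and therefore goes to $0$ when $n \to \infty$.

It follows that for all $A, \zeta$ we have

\[ \operatorname{Var}(\zeta_A, \mu_A) = \|\zeta\|_A^2, \]

which, as usual in common examples and as follows from Proposition 5.2, vanishes
exactly when $\zeta \in \mathcal{C}$.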
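
Both the $L^2$ formula on tangent vectors and the vanishing on coboundaries can be observed on explicit functions. The sketch below assumes, for illustration only, the doubling map $T(x) = 2x \bmod 1$ with $A \equiv -\log 2$, so that $\mu_A$ is the Lebesgue measure and the correlations reduce to Fourier orthogonality.

```latex
% Assumed setting (illustration only): T(x) = 2x mod 1, A = -log 2, mu_A = Lebesgue.
% (a) zeta(x) = cos(2 pi x) lies in ker L_A, i.e. is tangent to N at A:
\[
  \operatorname{Var}(\zeta, \mu_A) = \int_0^1 \cos^2(2\pi x)\,\mathrm{d}x
  + 2\sum_{k\ge 1}\int_0^1 \cos(2\pi x)\cos(2^{k+1}\pi x)\,\mathrm{d}x = \tfrac12 .
\]
% (b) zeta = g - g∘T with g(x) = cos(2 pi x), i.e. zeta(x) = cos(2 pi x) - cos(4 pi x):
\[
  \int_0^1 \zeta^2\,\mathrm{d}x = \tfrac12 + \tfrac12 = 1, \qquad
  \int_0^1 \zeta\cdot\zeta\circ T\,\mathrm{d}x
  = -\int_0^1 \cos^2(4\pi x)\,\mathrm{d}x = -\tfrac12, \qquad
  \int_0^1 \zeta\cdot\zeta\circ T^k\,\mathrm{d}x = 0 \ \ (k \ge 2),
\]
\[
  \operatorname{Var}(\zeta, \mu_A) = 1 + 2\cdot\big(-\tfrac12\big) = 0 .
\]
% Consistency check of (b) with the definition: the Birkhoff sum telescopes,
\[
  \sum_{i=0}^{n-1} \zeta\circ T^i = g - g\circ T^n, \qquad
  \frac1n\int_0^1 (g - g\circ T^n)^2\,\mathrm{d}x = \frac1n \xrightarrow[n\to\infty]{} 0 .
\]
```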

This concludes the proof of Theorem 5.1, and of Theorem D once one observes that

\[ D^2(\log \Lambda)_A = DG_A = \langle \cdot, \cdot \rangle_A. \]

Remark 5.4. We have thus recovered in our setting the convexity of $\log \Lambda$, which will
be given, as is customary, an interpretation in terms of "pressure" below.

A notable consequence of this is that the submanifold $\mathcal{N}$ does not contain any straight
interval; it is contained in the zero level set of $\log \Lambda$, so that any straight interval in $\mathcal{N}$
passing through $A$ would have its direction in $T_A \mathcal{N}$ and in the kernel of $\langle \cdot, \cdot \rangle_A$, whose
intersection is trivial.


6 Regularity of the Gibbs map: the Wasserstein 
structure 

The development of optimal transportation and more precisely of the 2-Wasserstein
distance has let an alternative differential structure for the set $\mathcal{P}(\Omega)$ emerge, notably
driven by the work of Otto [Ott01], Benamou and Brenier [BB00] and Ambrosio, Gigli
and Savaré [AGS05]. We shall rely on the formulation given by [Gig11], which allows one to
define the differentiability of a map at a point (as opposed to more global notions, such
as speed vectors defined almost everywhere). One could in principle consider the case
when $\Omega$ is a Riemannian manifold, but for simplicity we shall restrict to $\Omega = \mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$
throughout this section.

6.1 Elements of optimal transportation 

We will not give many details on optimal transportation, but many references are
available (e.g. [Vil09] for a comprehensive source). Let us say that the 2-Wasserstein distance
$W_2$ is a metric compatible with the weak topology, defined on $\mathcal{P}(\Omega)$ as the least cost
needed to move one measure to another, when the cost to move a unit of mass is
proportional to the squared distance between the starting point and the stopping point.

For each $\mu \in \mathcal{P}(\mathbb{S}^1)$, Gigli introduces a tangent space $T_\mu \mathcal{P}(\mathbb{S}^1)$ which may be only a
metric cone, but turns out to be a Hilbert space in a number of cases. There are several
possible definitions of such a tangent space (or cone), e.g. in terms of geodesics, in terms
of measures on the tangent bundle, or in terms of vector fields on the manifold; the work
of Gigli ties all these points of view together when $\mu$ belongs to a certain class of "nice"
measures. In the present one-dimensional case, the relevant class to be considered is the
set of atomless measures. Assuming $\mu \in \mathcal{P}(\mathbb{S}^1)$ has no atom, one can consider as tangent
space to $\mathcal{P}(\mathbb{S}^1)$ at $\mu$ the space

\[ T_\mu \mathcal{P}(\mathbb{S}^1) := L^2_\nabla(\mu) := \overline{\{ \nabla f \mid f \in C^\infty(\mathbb{S}^1) \}}^{L^2(\mu)} \]

of vector fields on $\mathbb{S}^1$ which are square-integrable with respect to $\mu$ and which are limits
of gradients of smooth functions in $L^2(\mu)$ (note that the quotient structure of $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$
makes it possible to identify all tangent spaces of $\mathbb{S}^1$ with $\mathbb{R}$, so that we can see vector
fields as functions, and $\nabla f$ is simply $f'$). There is an obvious exponential map: given
$\mu$ and $v \in T_\mu \mathcal{P}(\mathbb{S}^1)$ one sets $\exp_\mu(v) = (\mathrm{Id} + v)_\# \mu$, i.e. the mass at any point $x \in \mathbb{S}^1$ is
moved to $x + v(x) \bmod 1$. Then for each $v \in T_\mu \mathcal{P}(\mathbb{S}^1)$, one gets an exponential curve
$(\exp_\mu(tv))_{t \in [0, \varepsilon)}$ which has the property that

\[ W_2(\mu, \exp_\mu(tv)) = t \|v\|_\mu + o(t), \]

where $\|v\|_\mu$ is the $L^2(\mu)$-norm of $v$ (here the fact that $v$ can be approximated by gradients
is crucial).
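
To see why approximation by gradients matters in the last formula, one can look at rotations of the circle. The sketch below assumes, for illustration only, that $\mu$ is the Lebesgue measure on $\mathbb{S}^1$ and that $v \equiv c$ is a non-zero constant field, which is square-integrable but is not a limit of gradients.

```latex
% Assumed example: mu = Lebesgue measure on S^1, v(x) = c constant, c != 0, t > 0.
% Every gradient f' of a smooth function on S^1 has zero mean, so
\[
  \int_{\mathbb{S}^1} c\, f'(x) \,\mathrm{d}x = 0 \quad\text{for all smooth } f,
  \qquad\text{hence } c \text{ is orthogonal to } \overline{\{f'\}}^{L^2(\mu)} .
\]
% The push-forward by the rotation x -> x + tc leaves mu unchanged:
\[
  (\mathrm{Id} + t c)_\# \mu = \mu, \qquad
  W_2\big(\mu, (\mathrm{Id}+tc)_\#\mu\big) = 0 \ne t\,|c| = t\,\|v\|_\mu .
\]
```

The constant field moves every point at speed $|c|$, yet the measure does not move at all; the first-order expansion of $W_2$ therefore fails for this $v$.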

We will say that a curve $t \mapsto \mu_t$ from an interval to $\mathcal{P}(\mathbb{S}^1)$ is Wasserstein-differentiable
at $t_0$ with tangent vector $v \in T_{\mu_{t_0}} \mathcal{P}(\mathbb{S}^1)$ whenever it holds

\[ W_2\big( \mu_{t_0 + h}, \exp_{\mu_{t_0}}(hv) \big) = o(h). \]

Similarly, a map $H : \mathcal{Y} \to \mathcal{P}(\mathbb{S}^1)$ from a Banach space to the set of probability measures
on $\mathbb{S}^1$ is Wasserstein-differentiable at a point $A \in \mathcal{Y}$ in a direction $\zeta \in \mathcal{Y}$ whenever there
exists $v \in T_{H(A)} \mathcal{P}(\mathbb{S}^1)$ such that

\[ W_2\big( H(A + t\zeta), \exp_{H(A)}(tv) \big) = o(t), \]

i.e. the tangent vector $v$ describes the first-order variations of $H$ in the Wasserstein
distance. Of course, one can define more stringent versions of this definition (Fréchet-
like rather than Gâteaux-like), but since our result is negative we get the strongest
statement by sticking to the weakest definition.

When $\Omega$ is a manifold, in each of its variations (Gâteaux or Fréchet), Wasserstein
differentiability is stronger than the corresponding variation of affine differentiability
because of the continuity equation below; roughly, affine differentiability is about recording
the "vertical" variations of the measure, i.e. the variation of weight it gives to any given
set, while Wasserstein differentiability is about recording the "horizontal" variations of
the measure, i.e. how one should move the mass in the most economical way in order to
obtain the given change of measure. The physical principle of mass preservation leads
to the continuity equation, which in the present case $\Omega = \mathbb{S}^1$ has the following form:


Lemma 6.1. Assume that $(\mu_t)_t$ is a curve of probability measures on $\mathbb{S}^1$ which is
differentiable at $0$ with tangent vector $v \in T_{\mu_0} \mathcal{P}(\mathbb{S}^1)$; then for all smooth functions $\varphi$ we
have

\[ \frac{\mathrm{d}}{\mathrm{d}t} \int \varphi \,\mathrm{d}\mu_t \Big|_{t=0} = \int \varphi' v \,\mathrm{d}\mu_0. \]

The most common version of the continuity equation is stated for curves of measures,
with the above equality integrated over time. The proof of the present version is very
simple and can be found in [Klo15a].
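
For an exponential curve driven by a smooth field, the identity of Lemma 6.1 reduces to the chain rule. The following sketch assumes $\mu_t = (\mathrm{Id} + tv)_\# \mu_0$ with $v$ smooth (an assumption made for the example; the lemma itself covers general differentiable curves).

```latex
% Assumed curve: mu_t = (Id + t v)_# mu_0 with v smooth, phi smooth on S^1.
\[
  \int \varphi \,\mathrm{d}\mu_t = \int \varphi\big(x + t\,v(x)\big)\,\mathrm{d}\mu_0(x),
\]
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\int \varphi \,\mathrm{d}\mu_t\Big|_{t=0}
  = \int \varphi'(x)\,v(x)\,\mathrm{d}\mu_0(x),
\]
% differentiation under the integral sign being justified by the boundedness
% of phi', phi'' and v on the compact circle.
```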
A curve $(x_t)_{t \in I}$ in a metric space is said to be absolutely continuous whenever there
is a positive function $g \in L^1(I)$ such that for all $t_0 \le t_1$ in $I$:

\[ d(x_{t_0}, x_{t_1}) \le \int_{t_0}^{t_1} g(s) \,\mathrm{d}s \]

(note when considering a curve $(\mu_t)$ in $\mathcal{P}(\mathbb{S}^1)$, that this notion has nothing to do with
each measure $\mu_t$ being absolutely continuous or not!). In other words, an absolutely
continuous curve is a curve whose speed exists almost everywhere and is integrable. A
particular case is given by Lipschitz curves, whose speed is in $L^\infty$; absolute continuity is
therefore a very mild regularity condition. A Rademacher theorem holds in this setting:
an absolutely continuous curve in $\mathcal{P}(\mathbb{S}^1)$ endowed with the 2-Wasserstein distance is
differentiable at almost every time and satisfies the mean value theorem (see [AGS05]).


6.2 Roughness of the Gibbs map in the Wasserstein space 

We are now in a position to state and prove the main result of this section, which shows
that the Gibbs map is very far from being Wasserstein-smooth.

Theorem 6.2. Assume $T$ is $x \mapsto dx \bmod 1$ acting on $\mathbb{S}^1$ and $\mathcal{X}(\mathbb{S}^1)$ is the space of
$\alpha$-Hölder functions for some $\alpha \in (0, 1]$. If $(A_t)_t$ is any smooth curve in $\mathcal{X}(\mathbb{S}^1)$, then its
image curve $(\mu_{A_t})_t$ under the Gibbs map is not (even locally) absolutely continuous in
$(\mathcal{P}(\mathbb{S}^1), W_2)$ unless it is constant (i.e. unless $A_t \in A_0 + \mathcal{C}$ for all $t$).

Recalling the interpretation of the Wasserstein metric $W_2$ above, we see that changing
smoothly the potential changes smoothly the levels of the Gibbs measure (Theorem
4.1), but in a way that corresponds to brutal reallocations of the mass distribution
(Theorem 6.2). This result should be compared to Corollary 1.3 in [KLS14], where a
Lipschitz-regularity result is proved for the Gibbs map when $\mathcal{P}(\Omega)$ is endowed with the
1-Wasserstein distance (which however does not yield a differentiable structure).

The proof mostly relies on the following point-wise non-differentiability result.

Proposition 6.3. Under the same assumption as in Theorem 6.2, consider the Gibbs
map $G : \mathrm{Hol}_\alpha(\mathbb{S}^1) \to \mathcal{P}(\mathbb{S}^1)$ sending each $A$ to $\mu_A$.

If $G$ is Wasserstein-differentiable at any potential $A$ in any direction $\zeta$, then either $\mu_A$
is the Lebesgue measure (i.e. $A \in \mathcal{C}$) or the derivative vanishes (i.e. $W_2(\mu_{A + t\zeta}, \mu_A) = o(t)$).

Proof. Assume that G is Wasserstein-differentiable at A in the direction (. 

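If $\varphi$ is any smooth function, on the one hand the continuity equation gives

$$\frac{d}{dt}\Big|_{t=0} \int \varphi \, d\mu_{A+t\zeta} = \int \varphi' v \, d\mu_A$$

where $v \in L^2(\mu_A)$ is some vector field (which can be approximated by gradients in
$L^2(\mu_A)$); on the other hand Section 4 gives

$$\frac{d}{dt}\Big|_{t=0} \int \varphi \, d\mu_{A+t\zeta} = \int (I-\mathscr{L}_A)^{-1}(\varphi_A)\cdot DN_A(\zeta)\, d\mu_A.$$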
We get two very different-looking linear forms in $\varphi$ which both describe the variations
of its integral. The proof will thus be complete as soon as we prove that unless $\mu_A$ is
the Lebesgue measure, these two forms can agree only by vanishing.

For this, we use the following approximation lemma. 

Lemma 6.4. Let $\mu$ be a measure on $\mathbb{S}^1$ which is singular with respect to the Lebesgue
measure and without atoms; then for all $f \in L^2(\mu)$ and all $\beta < 1$ there is a sequence of
smooth functions $\varphi_n : \mathbb{S}^1 \to \mathbb{R}$ such that $\varphi_n' \to f$ in $L^2(\mu)$ and $\varphi_n \to 0$ in $\mathrm{Hol}_\beta(\mathbb{S}^1)$.

Proof. We first claim that when $I \subset [0,1]$ is an interval of length $\ell$, $w : I \to \mathbb{R}$ is
measurable and $\mu$-essentially bounded by some number $M$, and $\varepsilon > 0$, there is a smooth
function $\varphi : I \to \mathbb{R}$ such that $\varphi$ and all its derivatives vanish at the endpoints of $I$,
$\|\varphi\|_\infty \le \varepsilon$, $\|\varphi'\|_\infty \le M$, and $\int_I (\varphi' - w)^2 \, d\mu \le \varepsilon$.

Let $\eta > 0$ be arbitrary, to be chosen later on. Since $\mu$ is concentrated on a $\lambda$-negligible
set, there is a finite set of intervals $I_1,\dots,I_k \subset I$ with disjoint interiors whose total
length is less than $\eta$ and whose complement $J = I \setminus (I_1 \cup \dots \cup I_k)$ is given by $\mu$ a mass
less than $\eta$. Let $w_1$ be the function which:

• is constant on each $I_i$, with value the $\mu$-average of $w$ on $I_i$,

• is constant on $J$, with value such that $\int_I w_1 \, d\lambda = 0$.

By taking $\eta$ small enough and by dividing the intervals $I_i$ into smaller intervals, we can
ensure that $\int (w - w_1)^2 \, d\mu$ is arbitrarily small.

Let $w_2$ be a smooth approximation of $w_1$ such that $\int (w - w_2)^2 \, d\mu$ stays small, $w_2$ is
bounded by $M$, $\int_I w_2 \, d\lambda = 0$, and $w_2$ is zero on some neighborhoods of the endpoints of $I$
(this last condition is easy to fulfill since $\mu$ has no atom).

Define a smooth, $M$-Lipschitz function $\varphi$ by

$$\varphi(x) := \int_a^x w_2(t)\, dt$$

where $a = \min I$ is the starting point of $I$. Then $\varphi' = w_2$ is close to $w$ in $L^2(\mu, I)$ norm
and bounded above by $M$ (though $\varphi''$ is extremely large), and $\varphi$ and its derivatives
vanish at both endpoints of $I$. The uniform norm of $\varphi$ is then bounded by $M\eta$, and can
thus be made arbitrarily small, proving the claim.

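Now, given $f$ and an integer $n$, choose a $\mu$-essentially bounded function $w$ which is
$1/n$-close to $f$ in $L^2(\mu)$, call $M$ its essential bound, then choose $\ell$ small enough to ensure
that $\ell^{1-\beta} M < 1/n$. Divide $\mathbb{S}^1$ into intervals of length $\ell$ and apply the claim to each
of them. The boundary conditions enable us to glue the smooth functions defined on
each interval into a smooth function $\varphi_n$ defined on $\mathbb{S}^1$ such that $\varphi_n'$ is $M$-bounded and
$1/n$-close to $w$ in $L^2(\mu)$ (hence $2/n$-close to $f$) and $\|\varphi_n\|_\infty < 1/n$. For any $x, y \in \mathbb{S}^1$, when $|x - y| < \ell$ we get

$$\frac{|\varphi_n(x) - \varphi_n(y)|}{|x-y|^\beta} \le \|\varphi_n'\|_\infty |x-y|^{1-\beta} \le M \ell^{1-\beta} < \frac1n$$

and when $|x - y| \ge \ell$ we get

$$\frac{|\varphi_n(x) - \varphi_n(y)|}{|x-y|^\beta} \le \frac{|\varphi_n(x) - 0| + |0 - \varphi_n(y)|}{|x-y|^\beta} \le \frac{2M\ell}{|x-y|^\beta} \le 2M\ell^{1-\beta} < \frac2n.$$

This proves the Lemma. □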

Now we simply apply the Lemma to $f = v$, and $\beta = \alpha$ if $\alpha < 1$, or any lower $\beta$
otherwise (using that the thermodynamical formalism holds for the current $T$ with any
$\beta$). This gives us smooth functions $\varphi_n$ such that

$$\int v^2 \, d\mu_A = \lim_{n} \frac{d}{dt}\Big|_{t=0}\int \varphi_n \, d\mu_{A+t\zeta} = \lim_{n} \int (I-\mathscr{L}_A)^{-1}((\varphi_n)_A)\cdot DN_A(\zeta)\, d\mu_A = 0$$

(every operator being interpreted in the $\beta$-Hölder space if necessary) so that $v$ vanishes
$\mu_A$-almost everywhere, and the Wasserstein derivative of $\mu_{A+t\zeta}$ vanishes. □

Proof of Theorem 6.2. If $(\mu_{A_t})_t$ is absolutely continuous, it is differentiable almost everywhere
and from Proposition 6.3 we deduce that at each $t$ such that $(\mu_{A_t})_t$ is differentiable
and $\mu_{A_t}$ is not Lebesgue, its derivative vanishes. The mean value inequality then ensures that
$(\mu_{A_t})_t$ must be constant. □

We end this section with some open questions. First, Proposition 6.3 leaves open the 
following. 

Question 6.5. In the case of $T : x \mapsto dx \bmod 1$, is the Gibbs map differentiable at $A$
when $A \in \mathcal{C}$ (i.e. when $\mu_A$ is the Lebesgue measure)?

Second, note that the analogue of Theorem 6.2 for the shift is true independently of
the map $G$, since the 2-Wasserstein space of an ultrametric space such as $\mathcal{A}^{\mathbb{N}}$ contains
no non-constant absolutely continuous curve at all (see [Klo15b]). But the 2-Wasserstein space of a
manifold contains plenty of absolutely continuous curves (it is even a geodesic space),
so when $\Omega$ has a smooth structure, the irregularity of $G$ with respect to the Wasserstein
metric can be somewhat surprising. One then wonders how much it has to do with $G$,
and how much it has to do with its image:

Question 6.6. Assume $\Omega$ is a manifold and $T$ is smooth. Are there any non-constant,
absolutely continuous curves $(\mu_t)_t$ in $(\mathscr{P}(\Omega), W_2)$ such that $\mu_t$ is $T$-invariant for all $t$?
What about the subset of Gibbs measures with $\alpha$-Hölder potential?

In other words, we ask whether the set of $T$-invariant measures is a nice, somewhat
smooth subset of the set of all probability measures, or if from the Wasserstein point of
view it is a very irregular subset of $\mathscr{P}(\Omega)$ (one can think of the Von Koch curve in
$\mathbb{R}^2$ as an example of a connected, very irregular subset of a smooth space).

7 Application to equilibrium states 

In this section we use the differential calculus developed above to study several classical 
optimization problems. 
7.1 Entropy and pressure

Given our broad framework, we shall use the following Legendre transform definition for
entropy: for any $T$-invariant measure $\nu$, we set

$$h_{\mathcal{X}}(\nu) := \inf_{A \in \mathcal{X}} \Big( \log \lambda_A - \int A \, d\nu \Big).$$

Note that this quantity a priori depends on the chosen class of functions $\mathcal{X}(\Omega)$; but in
many cases it is in fact equal to the metric entropy of $\nu$, see Remarks 7.5 and 7.8. The
assumption (H6) ensures that $\mathcal{X}(\Omega)$ is quite large, preventing $h_{\mathcal{X}}$ from being too degenerate.

Remark 7.1. The number $\log \lambda_A - \int A \, d\nu$ only depends on the class $[A]$ of $A$ modulo $\mathcal{C}$
(adding a constant to $A$ changes $\log \lambda_A$ and $\int A \, d\nu$ by the same additive constant, and
adding a coboundary leaves both terms unchanged). In particular, one can rewrite

$$h_{\mathcal{X}}(\nu) = \inf_{A \in \mathcal{N}} \int (-A) \, d\nu$$

and observe that $A(y) = \log P(T(y) \to y)$ where

$$P(x \to y) := \begin{cases} e^{A(y)} & \text{when } T(y) = x \\ 0 & \text{otherwise} \end{cases}$$

defines transition probabilities for a Markov chain on $\Omega$ supported on backward orbits
of $T$. In other words, $h_{\mathcal{X}}(\nu)$ is the infimum of $\int (-\log P(T(y) \to y)) \, d\nu(y)$ over Markov
chains supported on backward orbits of $T$, such that transition probabilities depend on
the endpoint, with a regularity specified by $\mathcal{X}(\Omega)$.

Together with such a definition of entropy naturally comes a dual quantity, the pressure:
for any potential $B \in \mathcal{X}(\Omega)$ we set

$$P_T(B) := \sup_{\mu \in \mathscr{P}_T(\Omega)} \Big( h_{\mathcal{X}}(\mu) + \int B \, d\mu \Big).$$

In many cases (e.g. shift in the Bernoulli space), this turns out to coincide with the
classical topological pressure (see again Remarks 7.5 and 7.8). Here we will concentrate
on the study of the above Legendrian formulations for these quantities, as they fit our
framework most naturally.

One of our main concerns is to understand when and where the above infimum and
supremum are attained; we thus consider the families of functionals defined for $\mu, \nu \in \mathscr{P}_T(\Omega)$
and $A, B \in \mathcal{X}(\Omega)$ by

$$H_\nu(A) = \log \lambda_A - \int A \, d\nu$$
$$P_B(\mu) = h_{\mathcal{X}}(\mu) + \int B \, d\mu.$$

The functional $P_B$ is defined for all $T$-invariant measures but we shall also study its
restriction to Gibbs measures, considered as acting on potentials:

$$P_B(A) = h_{\mathcal{X}}(\mu_A) + \int B \, d\mu_A.$$

We will abusively use the same name $P_B$ for the map defined on invariant measures,
the map defined on potentials, its restriction to normalized potentials and the map it
induces on the quotient $\mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$. The way we write the argument ($P_B(\mu_A)$, $P_B(A)$
or $P_B([A])$) will usually make the difference clear.

Since $H_\nu$ is $\mathcal{C}$-invariant, it induces a functional on the quotient $\mathcal{Q}$, which we still denote
by $H_\nu$.
7.2 Classical equilibrium states and Legendre duality 

7.2.1 The entropy functionals 

We start with the study of the functionals $H_\nu$.

Proposition 7.2. For all $\nu \in \mathscr{P}_T(\Omega)$, the functional $H_\nu$ on $\mathcal{X}(\Omega)$ is analytic with

$$D(H_\nu)_A(\zeta) = \int \zeta \, d\mu_A - \int \zeta \, d\nu.$$

Moreover the map $[A] \mapsto H_\nu([A])$ induced on $\mathcal{Q}$ is strictly convex.

Proof. Let us recall that $\Lambda : \mathcal{X}(\Omega) \to (0, +\infty)$ is the analytic functional defined by
$\Lambda(A) = \lambda_A$, and that for all $A, \zeta \in \mathcal{X}(\Omega)$ we have $D(\log \Lambda)_A(\zeta) = \int \zeta \, d\mu_A$ (Corollary
3.6). Since the second term in $H_\nu(A) = \log \lambda_A - \int A \, d\nu$ is linear and thus analytic and
equal to its derivative at any point, $H_\nu$ is analytic with $D(H_\nu)_A(\zeta) = \int \zeta \, d\mu_A - \int \zeta \, d\nu$.
The second term is constant in $A$, and by the work of Sections 4 and 5 the second
derivative is given by

$$D^2(H_\nu)_A(\zeta, \eta) = DG_A(\eta)(\zeta) = \langle \eta, \zeta \rangle_A.$$

In other words, considering the functional $H_\nu$ induced on $\mathcal{Q}$ we have $D^2(H_\nu)_{[A]} = \langle \cdot, \cdot \rangle_{[A]}$
which is positive-definite, proving the strict convexity on $\mathcal{Q}$. □
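
The first-derivative formula $D(\log \Lambda)_A(\zeta) = \int \zeta \, d\mu_A$ can be sanity-checked numerically in a toy case. The following is a minimal sketch, not part of the original argument: it assumes the full shift on two symbols with potentials depending on two coordinates, the standard identification of the transfer operator with the positive matrix $M_{ij} = e^{A(i,j)}$, and the standard description of $\mu_A$ as a Markov measure. All function names are hypothetical.

```python
import math

def perron_left(M, iters=2000):
    """Leading eigenvalue lam and positive left eigenvector h (h M = lam h)
    of a positive 2x2 matrix, by power iteration; h is normalised to sum 1."""
    h, lam = [0.5, 0.5], 1.0
    for _ in range(iters):
        new = [h[0] * M[0][j] + h[1] * M[1][j] for j in range(2)]
        lam = new[0] + new[1]
        h = [v / lam for v in new]
    return lam, h

def log_lambda(A):
    """log of the leading eigenvalue of the matrix exp(A(i, j))."""
    M = [[math.exp(A[i][j]) for j in range(2)] for i in range(2)]
    return math.log(perron_left(M)[0])

def gibbs_two_cylinders(A):
    """Weights mu(ij*) of the Markov measure attached to A (assumed model)."""
    M = [[math.exp(A[i][j]) for j in range(2)] for i in range(2)]
    lam, h = perron_left(M)
    # Normalised, column-stochastic matrix Q[i][j] = M[i][j] h_i / (lam h_j).
    Q = [[M[i][j] * h[i] / (lam * h[j]) for j in range(2)] for i in range(2)]
    z = Q[0][1] + Q[1][0]
    pi = [Q[0][1] / z, Q[1][0] / z]          # stationary vector: Q pi = pi
    return [[Q[i][j] * pi[j] for j in range(2)] for i in range(2)]

A = [[0.3, -1.0], [0.5, 0.2]]       # arbitrary potential A(i, j)
zeta = [[1.0, -0.4], [0.7, 2.0]]    # arbitrary direction

def perturbed(s):
    return [[A[i][j] + s * zeta[i][j] for j in range(2)] for i in range(2)]

t = 1e-5
finite_diff = (log_lambda(perturbed(t)) - log_lambda(perturbed(-t))) / (2 * t)
mu = gibbs_two_cylinders(A)
integral = sum(mu[i][j] * zeta[i][j] for i in range(2) for j in range(2))
print(finite_diff, integral)        # the two numbers should agree closely
```

The central finite difference is only an approximation of the derivative, so the comparison is meaningful up to an error of order $t^2$.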

Note that we do not have uniform convexity (even locally) since the inner product is
only weakly positive-definite (there are directions $[\zeta]$ with fixed size $\|[\zeta]\|$ such that the
“convexity” $\|[\zeta]\|_{[A]}$ is arbitrarily small). Of course, $H_\nu$ is only weakly convex on $\mathcal{X}(\Omega)$
since it is constant along each fiber $A + \mathcal{C}$.

Proposition 7.2 now implies the following result. 

Corollary 7.3. When $\nu = \mu_B$ for some $B \in \mathcal{X}(\Omega)$, then $H_\nu$ is uniquely minimized
at $[A] = [B]$ and thus

$$h_{\mathcal{X}}(\mu_B) = \log \lambda_B - \int B \, d\mu_B = - \int N(B) \, d\mu_B.$$

When $\nu$ is not in the image of the Gibbs map, $H_\nu$ does not reach its infimum.

Note that a normalized $B$ is non-positive and non-zero and has $\lambda_B = 1$, so that
$h_{\mathcal{X}}(\mu_B) > 0$ (use (H5) to get the strict inequality).

Proof. Hypothesis (H6) implies that $\mathcal{X}(\Omega)$ “separates measures”, i.e. two measures that
give the same integral to every element of $\mathcal{X}(\Omega)$ are equal.
Using $D(H_\nu)_A(\zeta) = \int \zeta \, d(\mu_A - \nu)$ we see that when $\nu$ is not in the image of the Gibbs
map $H_\nu$ has no critical point, hence no minimum; and when $\nu = \mu_B$ the critical points
of $H_\nu$ are exactly the potentials $A$ such that $\mu_A = \mu_B$. Going down to the quotient we
get only one critical point $[B]$ and the strict convexity implies that this critical point is
the unique minimizer. □
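
The identity $\log \lambda_B - \int B \, d\mu_B = -\int N(B) \, d\mu_B$ of Corollary 7.3 can be illustrated in a toy case. The sketch below is an illustration under assumptions, not the general setting: it takes the full shift on two symbols, a potential $B(i,j)$ depending on two coordinates, and the usual normalization $N(B)(i,j) = B(i,j) + \log h_i - \log h_j - \log \lambda$ built from a left Perron eigenvector $h$. Names such as `perron_left` and `integrate` are hypothetical.

```python
import math

def perron_left(M, iters=2000):
    """Leading eigenvalue and positive left eigenvector of a positive 2x2 matrix."""
    h, lam = [0.5, 0.5], 1.0
    for _ in range(iters):
        new = [h[0] * M[0][j] + h[1] * M[1][j] for j in range(2)]
        lam = new[0] + new[1]
        h = [v / lam for v in new]
    return lam, h

B = [[0.3, -1.0], [0.5, 0.2]]                      # arbitrary potential B(i, j)
M = [[math.exp(B[i][j]) for j in range(2)] for i in range(2)]
lam, h = perron_left(M)

# Assumed normalisation: N(B)(i, j) = B(i, j) + log h_i - log h_j - log lam.
NB = [[B[i][j] + math.log(h[i]) - math.log(h[j]) - math.log(lam)
       for j in range(2)] for i in range(2)]
Q = [[math.exp(NB[i][j]) for j in range(2)] for i in range(2)]   # column-stochastic
z = Q[0][1] + Q[1][0]
pi = [Q[0][1] / z, Q[1][0] / z]                    # stationary vector of Q
mu = [[Q[i][j] * pi[j] for j in range(2)] for i in range(2)]     # mu(ij*)

def integrate(F):
    """Integral against mu of a function of the first two coordinates."""
    return sum(mu[i][j] * F[i][j] for i in range(2) for j in range(2))

legendre_value = math.log(lam) - integrate(B)      # log lambda_B - int B dmu_B
minus_int_NB = -integrate(NB)                      # - int N(B) dmu_B
H_at_zero = math.log(2.0)                          # H_nu(0) for nu = mu_B
print(legendre_value, minus_int_NB, H_at_zero)
```

The last value shows one competitor in the infimum: at the zero potential, $H_\nu$ equals $\log 2$, which is larger than the value reached at $[B]$.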

Remark 7.4. At first glance, it looks like we used that the Gibbs map $G : A \mapsto \mu_A$ is
one-to-one in this proof, while we were only able to prove it in some cases in Remark 2.11.
But in fact, the above proof rather implies the injectivity of $G$, as by strict convexity
for all $\nu$ there can exist at most one critical point of $H_\nu$ on $\mathcal{Q}$.

Remark 7.5. When $\nu = \mu_B$ is a Gibbs measure we thus obtain

$$h_{\mathcal{X}}(\mu_B) = \int \big( -\log e^{N(B)(y)} \big) \, d\mu_B(y)$$

where $e^{N(B)(y)}$ can be interpreted as a transition probability, or as the Jacobian of “$d\mu/d\mu \circ T$”
(Remark 2.11). This can be used for some $(\Omega, T, \mathcal{X}(\Omega))$ to show that $h_{\mathcal{X}}(\mu_B)$ is equal
to the metric entropy $h(\mu_B)$; in particular this is the case for the shift $\sigma$ acting on the
Bernoulli space $\mathcal{A}^{\mathbb{N}}$ with Hölder potentials (the Classical Thermodynamical Formalism
in the sense of [PP90]).

In this case $(\sigma, \mathrm{Hol}_\alpha)$ the equality $h_{\mathcal{X}}(\nu) = h(\nu)$ extends to any invariant probability
$\nu$. Indeed by Theorem 9.12 in [Wal82] for any $\sigma$-invariant probability $\nu$ on the
Bernoulli space, the metric entropy $h(\nu)$ satisfies

$$h(\nu) = \inf_{A \in C^0(\mathcal{A}^{\mathbb{N}})} \Big( P(A) - \int A \, d\nu \Big)$$

where $P$ is the topological pressure. As topological pressure is a continuous function of
the continuous potential $A$ (see Theorem 9.7 in [Wal82]) and the set of Hölder functions
is dense in $C^0(\mathcal{A}^{\mathbb{N}})$, the infimum above can be restricted to the Hölder potentials $A$. For
Hölder potentials the pressure satisfies $P(A) = \log \lambda_A$ and this shows that $h_{\mathcal{X}}(\nu) = h(\nu)$.

Of course this reasoning applies to all cases when the topological pressure coincides with
$\log \lambda$ and the metric entropy is the Legendre dual of pressure.

The analogous result is proved for Gibbs plans in Lemma 6 in [LMMS]. An invariant
probability is a particular case of a Gibbs plan (see equation (1) in [LMMS]) and this
provides another proof for the equality of entropies.
7.2.2 The pressure functionals

We are now in a position to extend the following classical result to our general framework. 

Theorem 7.6 (Gibbs measures are equilibrium states). For all $B \in \mathcal{X}(\Omega)$, we have
$P_T(B) = \log \lambda_B$ and $\mu_B$ is the unique maximizer of $P_B(\mu) = h_{\mathcal{X}}(\mu) + \int B \, d\mu$ among all
$T$-invariant probability measures.

Proof. Simply observe that

$$P_B(\mu) - \log \lambda_B = \inf_{A \in \mathcal{X}} H_\mu(A) - H_\mu(B).$$

Consider the functional $A \mapsto H_\mu(A) - H_\mu(B)$: it takes the value $0$ at $A = B$ and by
Corollary 7.3 this is its infimum precisely when $\mu = \mu_B$. We deduce that $P_B(\mu_B) = \log \lambda_B$
and that for any other measure $\mu \in \mathscr{P}_T(\Omega)$, $P_B(\mu) < \log \lambda_B$. □
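
A small numerical experiment can make this statement concrete. The sketch below is a toy illustration under assumptions, not a proof: it works on the full shift on two symbols, only compares against one-step Markov measures (a strict subset of all invariant measures), and uses the classical formula $-\sum_{i,j} \nu(ij*) \log S_{ij}$ for the entropy of a Markov measure with column-stochastic backward transition matrix $S$. The function names are hypothetical.

```python
import math

def leading_eigenvalue(M):
    """Closed-form leading eigenvalue of a positive 2x2 matrix."""
    half_tr = (M[0][0] + M[1][1]) / 2
    return half_tr + math.sqrt(((M[0][0] - M[1][1]) / 2) ** 2 + M[0][1] * M[1][0])

def pressure_functional(x, y, B):
    """h(nu) + int B dnu for the Markov measure nu whose backward transition
    matrix is the column-stochastic S(x, y) = [[x, 1 - y], [1 - x, y]]."""
    S = [[x, 1 - y], [1 - x, y]]
    z = (1 - x) + (1 - y)
    pi = [(1 - y) / z, (1 - x) / z]            # stationary vector: S pi = pi
    return sum(S[i][j] * pi[j] * (B[i][j] - math.log(S[i][j]))
               for i in range(2) for j in range(2))

B = [[0.3, -1.0], [0.5, 0.2]]                  # arbitrary potential B(i, j)
M = [[math.exp(B[i][j]) for j in range(2)] for i in range(2)]
lam = leading_eigenvalue(M)

# Diagonal entries of the normalised matrix (the eigenvector cancels there).
x_star, y_star = M[0][0] / lam, M[1][1] / lam

grid = [k / 100 for k in range(1, 100)]
best_on_grid = max(pressure_functional(x, y, B) for x in grid for y in grid)
at_gibbs = pressure_functional(x_star, y_star, B)
print(math.log(lam), at_gibbs, best_on_grid)
```

On this grid no Markov measure exceeds $\log \lambda_B$, and the value at the parameters obtained from the normalized potential matches it.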

Remark 7.7. The expression $P_T(B) = \log \lambda_B$ shows that $P_T$ and $h_{\mathcal{X}}$ are really Legendre
duals of one another, since we can now write the latter as $h_{\mathcal{X}}(\mu) = \inf_A \big( P_T(A) - \int A \, d\mu \big)$.

Remark 7.8. We can deduce from that result that $h_{\mathcal{X}}$ is the metric entropy and $P_T$ the
topological pressure whenever we know the latter to be equal to $\log \lambda$ and the former to
be its Legendre dual. In particular, this holds when $T$ is the shift over a finite alphabet
and $\mathcal{X} = \mathrm{Hol}_\alpha$, but of course in this case it is possible and more satisfactory to prove
that $h_{\mathcal{X}}$ and $P_T$ are the classical quantities^6 and recover their interpretation in terms of
eigenvalue and Legendre dual by the above.

Remark 7.9. As a particular case, the measure of maximal entropy is unique and equal
to $\mu_0$ where $0$ is the zero of $\mathcal{X}(\Omega)$. One can then describe $\mu_0$ as the stationary measure
for the Markov chain on $\Omega$ defined by the normalized potential $N(0)$. When $T$ is $d$-to-one,
then $-\log d$ is obviously normalized and in the class of $0$ modulo $\mathcal{C}$, so that the
measure of maximal entropy is the stationary measure for the uniform random walk on
backward orbits of $T$.

However, this hides some complications when points of $\Omega$ do not all have the same
number of inverse images under the action of $T$; it might then be quite difficult to
express $N(0)$.

^6 For example one can proceed as in [LMMS15], noting that we use here the classical normalization.

7.3 Gradients and gradient flows

7.3.1 Computation of some gradients

We can now use the metric $\langle \cdot, \cdot \rangle_A$ to define the gradients of the functionals $h_{\mathcal{X}}$ and $P_B$.
Note that a weak Riemannian metric such as $\langle \cdot, \cdot \rangle_A$ does not give a gradient to all
functionals: indeed $\langle \cdot, \cdot \rangle_A$ induces a continuous, one-to-one map from the tangent space
of $\mathcal{N}$ to its dual, but this map is not onto.^7 Only those functionals whose differential
belongs to the image of this map will have a gradient.

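First, the results of Section 4 and the very definition of the metric yield that $G_\varphi :
[A] \mapsto \int \varphi \, d\mu_A$ defined on the quotient $\mathcal{Q}$ has a gradient:

$$D(G_\varphi)_A(\zeta) = \langle \varphi, \zeta \rangle_A = \langle [\varphi], [\zeta] \rangle_{[A]}$$

$$\nabla G_\varphi([A]) = [\varphi].$$

Similarly, the function $A \mapsto \int \varphi \, d\mu_A$ defined on $\mathcal{N}$ has a gradient at $A$, given by $DN_A(\varphi)$
(recall that the gradient must be a vector in $T_A \mathcal{N} = \ker \mathscr{L}_A$ and that $DN_A$ is precisely
the projection on this space along $\mathcal{C}$).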

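Then, we consider the map $A \mapsto h_{\mathcal{X}}(\mu_A)$. As before, we will abusively denote by $h_{\mathcal{X}}$
this map, as well as its restriction to $\mathcal{N}$ and the map it induces on $\mathcal{Q}$.

From $h_{\mathcal{X}}(A) = -\int N(A) \, d\mu_A$, the product rule yields

$$D(h_{\mathcal{X}})_A(\zeta) = -\int DN_A(\zeta) \, d\mu_A - \langle N(A), \zeta \rangle_A = \langle -A, \zeta \rangle_A$$

since $DN_A(\zeta) \in \ker \mathscr{L}_A \subset \ker \mu_A$ and $\mathcal{C} = \ker \langle \cdot, \cdot \rangle_A$. This computation shows further
that $h_{\mathcal{X}}$ (now considered as induced on $\mathcal{Q}$ or restricted to $\mathcal{N}$) has a gradient:

$$\nabla h_{\mathcal{X}}([A]) = -[A], \quad \text{or again} \quad \nabla h_{\mathcal{X}}(A) = -DN_A(A) \text{ when } A \in \mathcal{N}.$$

Observing that $P_B(A) = h_{\mathcal{X}}(A) + G_B(A)$ we thus proved the following.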

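Proposition 7.10. The maps $G_\varphi$, $h_{\mathcal{X}}$ and $P_B$ have gradients for the weak Riemannian
metric $\langle \cdot, \cdot \rangle_A$, given by

$$\nabla G_\varphi([A]) = [\varphi] \qquad \nabla G_\varphi(A) = DN_A(\varphi)$$
$$\nabla h_{\mathcal{X}}([A]) = -[A] \qquad \nabla h_{\mathcal{X}}(A) = -DN_A(A)$$
$$\nabla P_B([A]) = [B - A] \qquad \nabla P_B(A) = DN_A(B - A)$$

where the functionals are considered either on $\mathcal{Q}$ (left column) or $\mathcal{N}$ (right column).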

^7 If it were, by Banach's isomorphism theorem the map $\zeta \mapsto \langle \zeta, \cdot \rangle_A$ from $\mathcal{X}(\Omega)$ to its dual would be
an isomorphism, which is equivalent to $\langle \cdot, \cdot \rangle_A$ being strongly positive-definite.

7.3.2 Gradient flow

One particularly nice feature of the gradient of the pressure $P_B$ computed in Section
7.3.1 is that it straightforwardly induces a gradient flow: for all $[A_0] \in \mathcal{Q}$, there is a
differentiable curve $[A_t]$ such that for all $t$

$$\frac{d}{dt}[A_t] = \nabla P_B([A_t]).$$

Indeed, a solution is given by

$$[A_t] = e^{-t}[A_0 - B] + [B].$$
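
For completeness, here is a short verification that the curve $[A_t] = e^{-t}[A_0 - B] + [B]$ is a gradient flow line. It only uses the gradient $\nabla P_B([A]) = [B - A]$ from Proposition 7.10 and the linear structure of the quotient $\mathcal{Q}$:

```latex
\frac{d}{dt}[A_t]
  = -e^{-t}[A_0 - B]
  = [B] - \bigl(e^{-t}[A_0 - B] + [B]\bigr)
  = [B - A_t]
  = \nabla P_B([A_t]),
\qquad
[A_t]\big|_{t=0} = [A_0],
\qquad
\lim_{t\to+\infty} [A_t] = [B].
```

In particular the flow line starts at $[A_0]$ and converges exponentially fast to the class $[B]$, which is the unique maximizer of $P_B$.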

Let us give a physical interpretation when $T$ is the shift: we consider a system consisting
of a Z-lattice of particles. A potential $A_0$ then represents a combination of the
interaction (and self-interaction) energy of the particles and of the temperature, and the
Gibbs measure $\mu_{A_0}$ is an equilibrium state (which minimizes the “free energy” $-P_{A_0}$)
and represents the macroscopic state of the system at equilibrium. Assume now that
the interactions of this system change instantly and are now described by the potential $B$. The
gradient flow above is a natural and simple model for the evolution of the macroscopic
state of the system, where the system evolves “driven” by $B$. Note that in this interpretation,
the state of the system out of equilibrium is an equilibrium state for a varying
potential.

Remark 7.11. Let us consider a particular case, where the interactions are constant
and only the temperature changes: $A_0 = \frac{1}{T_0}\varphi$ and $B = \frac{1}{T_1}\varphi$ for some $\varphi \in \mathcal{X}(\Omega)$;
this corresponds to a system in contact with a heat bath whose temperature changes
suddenly. According to our model, the system then evolves only in its temperature, as

$$[A_t] = e^{-t}[A_0 - B] + [B] = \Big( e^{-t}\Big(\frac{1}{T_0} - \frac{1}{T_1}\Big) + \frac{1}{T_1} \Big) [\varphi]$$

will be proportional to $[\varphi]$ for all $t$. Note that here, $t$ should not be considered as the
time, as the speed of evolution of temperature would not be right. It might be possible
to give a physical interpretation to the parameter $t$, or to rescale the functional $P_B$ and
the metric in a way to obtain a physically sound evolution of the temperature.

Remark 7.12. Beware that this gradient flow really takes place on $\mathcal{Q}$ (or equivalently,
on $\mathcal{N}$): it is not defined on the whole of $\mathcal{X}(\Omega)$ because there the metric has a non-trivial
kernel. Also, we cannot see this gradient flow as taking place in the set of invariant
measures with the Wasserstein structure, because of Section 6: the Gibbs map is not
differentiable, and when $(A_t)$ is an integral curve of our gradient flow, the curve $(\mu_{A_t})$ is
not absolutely continuous (Theorem 6.2) and in particular not a gradient flow curve in
the sense of Ambrosio, Gigli and Savaré [AGS05].

7.4 Prescribing integrals 

In this section we study how one can find Gibbs measures with prescribed values for
the integrals of a given set of test functions. This is both an application of the tools
we introduced here (in particular, the weak metric of Section 5 makes the proof quite
easy), and a main ingredient in the proof of existence and uniqueness of equilibrium
states under linear constraints.

Fix a tuple of test functions $\Phi = (\varphi_1, \dots, \varphi_K) \in \mathcal{X}(\Omega)^K$; we want to study the set
$\mathrm{Rot}(\Phi)$ of possible values taken by the rotation vector

$$\mathrm{rv}(\mu) = \Big( \int \varphi_1 \, d\mu, \dots, \int \varphi_K \, d\mu \Big)$$

where $\mu$ runs over the set $\mathscr{P}_T$ of $T$-invariant probability measures, and the freedom one
has to prescribe the values of these integrals with respect to a Gibbs measure.

It is well-known and straightforward that $\mathrm{Rot}(\Phi)$ is convex; it must also be bounded
since potentials are assumed to be bounded by (H1).

Observe that if the classes modulo $\mathcal{C}$ of the $\varphi_k$ are linearly dependent, then their
integrals with respect to any invariant measure must satisfy a linear relation. Let us be
more specific: if $g - g \circ T + c$ is any element of $\mathcal{C}$ and $\mu$ is any $T$-invariant probability
measure, then $\int (g - g \circ T + c) \, d\mu = c$. Therefore, if there is a non-trivial relation
$\sum x_k [\varphi_k] = 0$ then there are $g \in \mathcal{X}$ and $c \in \mathbb{R}$ such that $\sum x_k \varphi_k = g - g \circ T + c$ and for
all $\mu \in \mathscr{P}_T$ we get the relation $\sum x_k \int \varphi_k \, d\mu = c$, constraining the vector of integrals to
an affine subspace of $\mathbb{R}^K$. But this constraint on the rotation vector can be worked out
from the $\varphi_k$, and one can restrict to a maximal subset of indexes $S \subset \{1, \dots, K\}$ such
that the family $([\varphi_k])_{k \in S}$ is linearly independent. Then the corresponding integrals will
determine the integrals of all $\varphi_k$. This procedure reduces the problem to the case when
the $[\varphi_k]$ are linearly independent, which we will always assume in the sequel.

We then get the following (which does not claim much originality, see [KW14] and
[Jen01]; note that our proof is close to the one by Kucherenko and Wolf, but the metric
$\langle \cdot, \cdot \rangle_A$ makes the injectivity of the Jacobian obvious and we use a differential-geometric
argument to show that the map is onto).

Theorem 7.13. Let $\Phi = (\varphi_1, \dots, \varphi_K) \in \mathcal{X}(\Omega)^K$ be such that the classes $[\varphi_1], \dots, [\varphi_K]$
modulo $\mathcal{C}$ are linearly independent. Then for all $B \in \mathcal{X}(\Omega)$, the map

$$\mathbb{R}^K \to \operatorname{int} \mathrm{Rot}(\Phi)$$
$$(a_1, \dots, a_K) \mapsto \mathrm{rv}\big( \mu_{B + a_1 \varphi_1 + \dots + a_K \varphi_K} \big)$$

is an analytic diffeomorphism; in particular $\mathrm{Rot}(\Phi)$ has non-empty interior and all its
interior values are achieved by Gibbs measures.

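Proof. Consider the analytic maps

$$I : \mathbb{R}^K \to \mathcal{X}(\Omega), \qquad a = (a_1, \dots, a_K) \mapsto B + \sum_k a_k \varphi_k$$

where $B$ is any fixed potential,

$$J : \mathcal{X}(\Omega) \to \mathbb{R}^K, \qquad A \mapsto \Big( \int \varphi_1 \, d\mu_A, \dots, \int \varphi_K \, d\mu_A \Big)$$

and their composition $L = J \circ I : \mathbb{R}^K \to \mathbb{R}^K$. We also denote by $L_k$ the $k$-th component
of $L$, i.e. $L_k(a) = \int \varphi_k \, d\mu_{I(a)}$.
The differential of $L$ is given by Sections 4 and 5:

$$\frac{\partial L_k}{\partial a_j}(a) = \langle [\varphi_k], [\varphi_j] \rangle_{I(a)}.$$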
This defines a Gram matrix, which is invertible since the $[\varphi_k]$ are linearly independent;
it follows from the local inverse function theorem that $L$ is a local diffeomorphism.

If $B$ is any potential, this implies that $\mathrm{rv}(\mu_B) = L(0)$ is in the interior of the image of $L$, in
particular in the interior of $\mathrm{Rot}(\Phi)$ (which must thus be non-empty).

What we have left to prove is that $L$ is a global diffeomorphism from $\mathbb{R}^K$ to $\operatorname{int} \mathrm{Rot}(\Phi)$.
Since that interior is diffeomorphic to $\mathbb{R}^K$, a theorem of [Gor72] reduces this to proving
that $L$ is proper when its codomain is taken to be $\operatorname{int} \mathrm{Rot}(\Phi)$, i.e. that whenever
a sequence $(x^{(n)})$ escapes the compacts of $\mathbb{R}^K$, the points $L(x^{(n)})$ escape the compacts of
$\operatorname{int} \mathrm{Rot}(\Phi)$. In other words, we want to prove that if $x^{(n)} \to \infty$ and $L(x^{(n)})$ converges,
the limit lies on $\partial \mathrm{Rot}(\Phi)$.

Now, if $x^{(n)} \to \infty$ and $L(x^{(n)})$ converges, up to taking a subsequence we can assume
that $x^{(n)} = t_n u + o(t_n)$ where $(t_n)$ is a diverging sequence of positive numbers, and $u$ is
a unit vector in $\mathbb{R}^K$ (this is simply the compactness of the unit sphere).

Observe that if $x$ is a boundary point of $\mathrm{Rot}(\Phi)$ and $\ell = \sum v_k e_k^*$ (where $(e_k^*)$ is the
canonical dual basis) is a linear form of $\mathbb{R}^K$ whose maximum on $\mathrm{Rot}(\Phi)$ is reached at $x$,
then

$$\ell(x) = \max \Big\{ \int \sum v_k \varphi_k \, d\mu \ \Big| \ \mu \in \mathscr{P}_T \Big\}$$

and reciprocally points maximizing a linear form must lie on the boundary.

Back to $L(x^{(n)})$, we have $I(x^{(n)}) = t_n \varphi_u + o(t_n)$ where $\varphi_u = \sum u_k \varphi_k$. The variational
principle tells us that $\mu_{I(x^{(n)})}$ maximizes $h_{\mathcal{X}}(\mu) + \int (t_n \varphi_u + o(t_n)) \, d\mu$ and it follows that
the accumulation points of this sequence of measures are all maximizing measures of $\varphi_u$.
This precisely means that the limit of $L(x^{(n)})$ is a boundary point, and we are done. □
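
To see the diffeomorphism at work in the simplest case $K = 1$, one can solve numerically for the multiplier $a$ such that $\mu_{B + a\varphi}$ has a prescribed integral of $\varphi$. The sketch below is a toy illustration under assumptions: the full shift on symbols $\{0, 1\}$, potentials depending on two coordinates, $\varphi = \mathbf{1}_{0*}$, and the standard Markov description of the Gibbs measure. It relies on the monotonicity of $a \mapsto \int \varphi \, d\mu_{B + a\varphi}$ (its derivative is $\langle \varphi, \varphi \rangle \ge 0$) to use bisection; the function names and the search interval are hypothetical choices.

```python
import math

def mass_of_zero(A):
    """mu_A(0*) for a two-coordinate potential A(i, j) on the full 2-shift,
    using the closed-form leading eigenvalue of the 2x2 matrix exp(A)."""
    m = [[math.exp(A[i][j]) for j in range(2)] for i in range(2)]
    half_tr = (m[0][0] + m[1][1]) / 2
    lam = half_tr + math.sqrt(((m[0][0] - m[1][1]) / 2) ** 2 + m[0][1] * m[1][0])
    q01 = 1 - m[1][1] / lam      # off-diagonal entries of the normalised,
    q10 = 1 - m[0][0] / lam      # column-stochastic matrix
    return q01 / (q01 + q10)     # first entry of its stationary vector

def add_multiple_of_phi(B, a):
    """B + a * phi with phi the indicator of the cylinder 0*."""
    return [[B[0][0] + a, B[0][1] + a], [B[1][0], B[1][1]]]

def solve_multiplier(B, w, lo=-20.0, hi=20.0, steps=100):
    """Bisection for the a with mu_{B + a phi}(0*) = w."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mass_of_zero(add_multiple_of_phi(B, mid)) < w:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a_zero = solve_multiplier([[0.0, 0.0], [0.0, 0.0]], 0.9)
B = [[0.3, -1.0], [0.5, 0.2]]
a_B = solve_multiplier(B, 0.9)
check = mass_of_zero(add_multiple_of_phi(B, a_B))
print(a_zero, math.log(9), a_B, check)
```

For $B = 0$ the Gibbs measure of $a\varphi$ is Bernoulli with weight $e^a/(1+e^a)$ on the symbol $0$, so the multiplier for the target value $0.9$ is $\log 9$.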


As a by-product of this result, we get the following. 

Corollary 7.14. If $\mathcal{X}(\Omega)$ is separable,^8 then the set of Gibbs measures $G(\mathcal{X}(\Omega))$ is
weakly dense in $\mathscr{P}_T(\Omega)$.


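Proof. By assumption, there is a sequence $(\varphi_k)_{k \in \mathbb{N}}$ of elements of $\mathcal{X}(\Omega)$ such that every
continuous $f : \Omega \to \mathbb{R}$ is the uniform limit of a subsequence $(\varphi_{k_j})_{j \in \mathbb{N}}$.

Let $\mu \in \mathscr{P}_T(\Omega)$; from Theorem 7.13, for each $K \in \mathbb{N}$ there is a potential $A_K \in \mathcal{X}(\Omega)$
such that

$$\Big| \int \varphi_k \, d\mu_{A_K} - \int \varphi_k \, d\mu \Big| \le \frac{1}{K} \qquad \forall k \in \{1, \dots, K\}.$$

Given any continuous $f : \Omega \to \mathbb{R}$ and any $\varepsilon > 0$, there is some $k_0$ such that
$\| f - \varphi_{k_0} \|_\infty < \varepsilon$. For all $K > \max(k_0, 1/\varepsilon)$ we thus have

$$\Big| \int f \, d\mu_{A_K} - \int f \, d\mu \Big| \le \Big| \int f \, d\mu_{A_K} - \int \varphi_{k_0} \, d\mu_{A_K} \Big| + \Big| \int \varphi_{k_0} \, d\mu_{A_K} - \int \varphi_{k_0} \, d\mu \Big| + \Big| \int \varphi_{k_0} \, d\mu - \int f \, d\mu \Big| \le 3\varepsilon.$$

Letting $\varepsilon \to 0$, we see that $\int f \, d\mu_{A_K} \to \int f \, d\mu$ so that $(\mu_{A_K})$ converges weakly to $\mu$. □

^8 Or more generally if in (H6) the approximation can be obtained from a fixed countable subset of $\mathcal{X}$.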
7.5 Optimization under constraints

Our goal here is to optimize the $P_B$ functionals (for example, the entropy $h_{\mathcal{X}}$) on natural
subsets of invariant measures, obtained by constraining the integrals of some functions.
These questions have been considered by Jenkinson [Jen01] in the case of entropy and
Kucherenko and Wolf [KW14, KW13], with somewhat different assumptions and methods.
We believe that some of our statements are more explicit on certain points.

We fix as before test functions $\Phi = (\varphi_1, \dots, \varphi_K) \in \mathcal{X}(\Omega)^K$ and we consider the set
$\mathscr{P}_T[\Phi]$ of $T$-invariant measures $\mu$ such that $\int \varphi_k \, d\mu = 0$ for all $k$; among them are the
Gibbs measures whose normalized potential lies in

$$\mathcal{N}[\Phi] := \Big\{ A \in \mathcal{N} \ \Big| \ \forall k : \int \varphi_k \, d\mu_A = 0 \Big\}.$$

We will also denote by $\mathcal{Q}[\Phi]$ the set of classes $[A] \in \mathcal{Q} = \mathcal{X}(\Omega)/\mathcal{C}$ such that $A \in \mathcal{N}[\Phi]$.

With this notation, we will prove the following constrained (or “localized”) version
of the variational principle.

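Theorem 7.15. Let $\Phi = (\varphi_1, \dots, \varphi_K) \in \mathcal{X}(\Omega)^K$ be such that the $[\varphi_k]$ are linearly
independent, and such that $0$ is an interior vector of $\mathrm{Rot}(\Phi)$. For each $B \in \mathcal{X}(\Omega)$ denote
by $B_0$ the unique element $B_0 = B + a_1 \varphi_1 + \dots + a_K \varphi_K$ such that $[B_0] \in \mathcal{Q}[\Phi]$ (Theorem
7.13).

Then $\mu_{B_0}$ uniquely maximizes $P_B$ over $\mathscr{P}_T[\Phi]$, and the value of the maximum is
$P_B(B_0) = \log \lambda_{B_0}$.

Proof. We simply observe that for all $\mu \in \mathscr{P}_T[\Phi]$ we have

$$P_B(\mu) = h_{\mathcal{X}}(\mu) + \int (B_0 - a_1 \varphi_1 - \dots - a_K \varphi_K) \, d\mu = h_{\mathcal{X}}(\mu) + \int B_0 \, d\mu = P_{B_0}(\mu).$$

Applying Theorem 7.6 to $P_{B_0}$ we see that $P_B(\mu_{B_0}) = P_{B_0}(\mu_{B_0}) = \log \lambda_{B_0}$ is greater than
$P_B(\mu) = P_{B_0}(\mu)$ whenever $\mu \neq \mu_{B_0}$ is in $\mathscr{P}_T[\Phi]$. □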

We can use this to recover in our setting another result from [KW14]. 

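Corollary 7.16. Let $\Phi = (\varphi_1, \dots, \varphi_K) \in \mathcal{X}(\Omega)^K$, such that the $[\varphi_k]$ are linearly independent
and $B \in \mathcal{X}(\Omega)$, and for $w \in \operatorname{int} \mathrm{Rot}(\Phi)$ define

$$H(w) = \sup \{ h_{\mathcal{X}}(\mu) : \mathrm{rv}(\mu) = w \}.$$

Then $H$ is a positive, analytic map.

Proof. By Theorem 7.13 we know that there are uniquely defined analytic functions
$a_k : \operatorname{int} \mathrm{Rot}(\Phi) \to \mathbb{R}$ such that

$$\mathrm{rv}\big( \mu_{a_1(w) \varphi_1 + \dots + a_K(w) \varphi_K} \big) = w \qquad \text{for all } w \in \operatorname{int} \mathrm{Rot}(\Phi).$$

Setting $A(w) = a_1(w) \varphi_1 + \dots + a_K(w) \varphi_K$ and applying Theorem 7.15 to $(\varphi_1 - w_1, \dots, \varphi_K - w_K)$
we obtain

$$H(w) = h_{\mathcal{X}}(\mu_{A(w)}) = \log \Lambda(A(w)) - a_1(w) w_1 - \dots - a_K(w) w_K,$$

proving the claim. □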
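
As a worked illustration of this formula (a toy case, assuming the full shift on $\{0,1\}$, $K = 1$, $\varphi = \mathbf{1}_{0*}$, and the standard fact that the Gibbs measure of $a\varphi$ is the Bernoulli measure giving weight $e^a/(1+e^a)$ to the symbol $0$, with $\lambda_{a\varphi} = 1 + e^a$), one recovers the binary entropy function:

```latex
\begin{aligned}
\operatorname{rv}(\mu_{a\varphi}) &= \frac{e^{a}}{1+e^{a}} = w
  \quad\Longleftrightarrow\quad a(w) = \log\frac{w}{1-w}, \qquad w \in (0,1),\\
\log \Lambda(A(w)) &= \log\bigl(1 + e^{a(w)}\bigr) = -\log(1-w),\\
H(w) &= -\log(1-w) - w \log\frac{w}{1-w}
      = -w\log w - (1-w)\log(1-w).
\end{aligned}
```

The resulting $H$ is indeed positive and analytic on $\operatorname{int} \mathrm{Rot}(\Phi) = (0,1)$.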
Remark 7.17. Assume that $T$ is the shift over a finite alphabet and $\mathcal{X} = \mathrm{Hol}_\alpha$ (recall
that $h(\mu) = h_{\mathcal{X}}(\mu)$ in this case, Remark 7.5). Let $n$ be any positive integer, and let
$\Phi = (\varphi_1, \dots, \varphi_K)$ and $B$ be Hölder functions that only depend on the first $n$ coordinates,
and such that $\mathcal{Q}[\Phi]$ is non-empty.

Then we claim that there is a unique measure maximizing $P_B(\mu)$ among all elements
of $\mathscr{P}_T[\Phi]$, and that this measure is an $(n-1)$-step Markov measure (i.e. a Gibbs measure
$\mu_A$ such that $N(A)$ only depends on the first $n$ coordinates). In particular, applying this
to $B = 0$, there is an $(n-1)$-step Markov measure maximizing the entropy subject to
any finite set of simultaneously satisfiable constraints $\int \varphi_k \, d\mu = 0$ whenever the $\varphi_k$ are
constant on cylinders of depth $n$.

Proof. The only point that does not follow immediately from Theorem 7.15 is that $\mu_A$
is $(n-1)$-step Markov. But we know that we can take $A = B + \sum x_k \varphi_k$ for some $(x_k)$; notice that
this $A$ might not be normalized, but is constant on each depth-$n$ cylinder.

Now, $\mathscr{L}_A$ preserves the subspace of $\mathcal{X}(\Omega)$ made of functions that only depend on the
first $(n-1)$ coordinates. In particular, for all $N$ the function $\mathscr{L}_A^N \mathbf{1}$ only depends on
the first $(n-1)$ coordinates. Since this is a closed space, the leading eigenfunction $h_A$
only depends on the first $(n-1)$ coordinates, and $h_A \circ T$ only depends on the first $n$
coordinates.

Now $N(A) = A + \log h_A - \log h_A \circ T - \log \lambda_A$ only depends on the first $n$ coordinates,
which precisely means that $\mu_A$ is $(n-1)$-step Markov. □

Let us give a couple of examples, which we will not make as general as possible but
we will intentionally keep very explicit. Let $\Omega = \{0,1\}^{\mathbb{N}}$, $T$ be the shift and $\mathcal{X}(\Omega)$ be a
space of Hölder functions for one of the usual metrics of $\Omega$. Given any finite word $u$, let
$u*$ be the cylinder defined by $u$, i.e. the set of words starting with $u$.

Example 7.18. Among shift-invariant measures $\mu$ such that $\mu(0*) = .9$, the Bernoulli
measure of parameter $.9$ (i.e. the law of the word $a_1 a_2 \dots$ where the $a_i$ are i.i.d. random
variables taking the value $0$ with probability $.9$) maximizes entropy.

Indeed, from Remark 7.17 we know that there is a Bernoulli measure realizing this
maximum, and the Bernoulli measure with parameter $.9$ is the only one to satisfy the
constraint.
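
A quick numerical comparison illustrates this example. The sketch below is only a partial check under assumptions: it compares the Bernoulli measure with the one-step Markov measures satisfying the same constraint (not with all shift-invariant measures), and uses the classical entropy formula for a stationary two-state Markov chain. The parametrization by `s` and the function names are hypothetical.

```python
import math

def H2(p):
    """Binary entropy (natural logarithm)."""
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def markov_entropy(s):
    """Entropy of the stationary two-state chain with P(0 -> 1) = s and
    P(1 -> 0) = 9 s; its stationary law gives mass 0.9 to the symbol 0."""
    return 0.9 * H2(s) + 0.1 * H2(9 * s)

# s must satisfy 0 < s and 9 s < 1; s = 0.1 is the Bernoulli case.
grid = [k / 10000 for k in range(1, 1111)]
best_s = max(grid, key=markov_entropy)
print(best_s, markov_entropy(best_s), H2(0.9))
```

Among these chains the entropy is largest at `s = 0.1`, where the chain has independent coordinates and the entropy equals the binary entropy of $.9$.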

Example 7.19. Among shift-invariant measures $\mu$ such that $\mu(01*) = 2\mu(11*)$, the
Markov measure associated to the transition probabilities

$$P(0 \to 0) = 1 - a \qquad P(0 \to 1) = a$$
$$P(1 \to 0) = \tfrac{2}{3} \qquad P(1 \to 1) = \tfrac{1}{3}$$

where $a$ is the only real solution to

$$(1 - a)^5 = \frac{4}{27} a^2 \qquad (a \simeq 0.487803)$$

maximizes entropy.
It is easily seen that the constraint is satisfiable by a Markov measure, in particular by
a Gibbs measure, thus we can apply Remark 7.17 to $B = 0$, $K = 1$ and $\varphi = \mathbf{1}_{01*} - 2 \cdot \mathbf{1}_{11*}$
where $\mathbf{1}_S$ is the indicator function of the set $S$.

The constraint easily translates into $P(1 \to 0) = \frac{2}{3}$, and we define $a = P(0 \to 1)$. We
know that the entropy-maximizing Gibbs measure is given by a potential of the form
$A = x\varphi$ where $x \in \mathbb{R}$; to translate this into the transition probabilities, we only have to
normalize $A$:

$$N(A) = x\varphi + \log h - \log h \circ T + \log \lambda$$

where $\lambda \in \mathbb{R}$ and $h$ only depends on the first coordinate and matters only up to a
multiplicative constant; we thus define $\alpha = h(0*)/h(1*)$. Letting $\eta = e^x$, we then
recover the transition probabilities as follows:

$$P(0 \to 0) = \lambda \qquad P(0 \to 1) = \eta \alpha^{-1} \lambda$$
$$P(1 \to 0) = \alpha \lambda \qquad P(1 \to 1) = \eta^{-2} \lambda$$

We then have to solve the system

$$\begin{cases} 1 - a = \lambda \\ a = \eta \alpha^{-1} \lambda \\ 2/3 = \alpha \lambda \\ 1/3 = \eta^{-2} \lambda. \end{cases}$$

This will give the only $\eta$ such that $\mu_A$ with the above $A$ satisfies the constraint, and
from Remark 7.17 we know that $\mu_A$ maximizes entropy under this constraint; then the
corresponding value of $a$ gives the transition probability we seek. Note that, while we
have some computation to do, we do not have to estimate the actual entropy of Markov
measures, nor do we have to compute directly the eigendata of $\mathscr{L}_A$.

The above system is easily solved by substitution: $\lambda = 1 - a$, then $\alpha = 2/(3(1 - a))$,
$\eta = 2a/(3(1 - a)^2)$ and finally the last equation yields $[2a/(3(1 - a)^2)]^2 = 3(1 - a)$, so
that $(1 - a)^5 = \frac{4}{27} a^2$.
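
The final equation can be solved numerically and cross-checked against a direct maximization of entropy. The sketch below is an illustration under assumptions: the direct search only ranges over the one-step Markov chains satisfying the constraint (those with $P(1 \to 0) = 2/3$), which is legitimate by Remark 7.17 but is not an independent proof, and it uses the classical entropy formula for a stationary two-state chain. Function and variable names are hypothetical.

```python
import math

def f(a):
    """Left-hand side minus right-hand side of (1 - a)^5 = (4/27) a^2."""
    return (1 - a) ** 5 - 4 * a * a / 27

# f is decreasing on [0, 1] with f(0) > 0 > f(1): plain bisection applies.
lo, hi = 0.0, 1.0
for _ in range(100):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
a_root = (lo + hi) / 2

def H2(p):
    """Binary entropy (natural logarithm)."""
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def entropy(a):
    """Entropy of the stationary chain with P(0 -> 1) = a and P(1 -> 0) = 2/3."""
    pi1 = a / (a + 2 / 3)               # stationary mass of the symbol 1
    return (1 - pi1) * H2(a) + pi1 * H2(1 / 3)

grid = [k / 100000 for k in range(1, 100000)]
a_best = max(grid, key=entropy)
print(a_root, a_best)
```

Both approaches give a value close to the $a \simeq 0.487803$ stated in Example 7.19.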

8 Explicit computations for a restricted model 

In this section we explicitly show an example of the construction of Section 5 and some of
its consequences. The dynamic we consider is the shift acting on the space $\{1,2\}^{\mathbb{N}}$. We
choose $\mathcal{X}$ to be the space of $\alpha$-Hölder functions for any $\alpha$, and denote by $\mathcal{X}_2$ the subset of
potentials which depend only on the first two coordinates (of elements in $\{1,2\}^{\mathbb{N}}$). Note
that we formally cannot take $\mathcal{X}_2$ as our full space of potentials, since it is not invariant
under composition by $T$. It is easy to check that, in this setting, (H1)-(H6) are satisfied
(or one can find all the details in [PP90]).


8.1 A positively curved metric


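If a potential $A \in \mathcal{X}_2$ depends just on two coordinates then we can write $A(i,j)$ for the
value of $A$ evaluated on the cylinder $ij*$ (i.e. the elements of $\{1,2\}^{\mathbb{N}}$ of the type $ij\cdots$),
and we shall identify $\mathcal{X}_2$ with the space of 2 by 2 real matrices. The value of $e^A$ is well
defined on such a cylinder, and the action of the operator $\mathscr{L}_A$ on potentials $\varphi$ depending
only on the first coordinate explicitly reads

$$\mathscr{L}_A \varphi = \begin{pmatrix} \varphi(1*) & \varphi(2*) \end{pmatrix} \begin{pmatrix} e^{A_{11}} & e^{A_{12}} \\ e^{A_{21}} & e^{A_{22}} \end{pmatrix}.$$

We can thus think of the operator $\mathscr{L}_A$ as acting on a function as a left multiplication of
the matrix, and we shall denote by $L$ the map

$$L : A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \mapsto L_A = \begin{pmatrix} e^{A_{11}} & e^{A_{12}} \\ e^{A_{21}} & e^{A_{22}} \end{pmatrix}$$

which exponentiates each coordinate of a matrix $A$, and identify freely $\mathscr{L}_A$ and $L_A$.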
To normalize the potential, that is to find the potential $\bar{A} := N(A)$ differing from $A$ by
a coboundary and a constant such that $\mathscr{L}_{\bar{A}}(\mathbf{1}) = \mathbf{1}$, we can apply the Perron-Frobenius
theorem^9 to the matrix $L_A$ and solve with respect to the maximal eigenvalue $\lambda$ and the
left eigenvector $l$, i.e. $l L_A = \lambda l$. After the normalization, we obtain

$$\bar{A}(i,j) = A(i,j) + \log l_i - \log l_j - \log \lambda.$$

From now on, we will assume that $A$ is normalized and avoid the notation $\bar{A}$.
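
The normalization step can be sketched numerically as follows. This is a minimal illustration under assumptions, not the authors' computation: it uses power iteration to approximate the Perron-Frobenius data of $L_A$, and the formula $\bar{A}(i,j) = A(i,j) + \log l_i - \log l_j - \log \lambda$ with $l$ a positive left eigenvector. Indices $0, 1$ stand for the symbols $1, 2$, and the function name `normalize` is hypothetical.

```python
import math

def normalize(A, iters=2000):
    """Return a potential cohomologous to A up to a constant whose entrywise
    exponential is column-stochastic (sketch based on a left eigenvector)."""
    M = [[math.exp(A[i][j]) for j in range(2)] for i in range(2)]
    l, lam = [0.5, 0.5], 1.0
    for _ in range(iters):                   # power iteration for l M = lam l
        new = [l[0] * M[0][j] + l[1] * M[1][j] for j in range(2)]
        lam = new[0] + new[1]
        l = [v / lam for v in new]
    return [[A[i][j] + math.log(l[i]) - math.log(l[j]) - math.log(lam)
             for j in range(2)] for i in range(2)]

A = [[0.3, -1.0], [0.5, 0.2]]                # arbitrary two-coordinate potential
A_bar = normalize(A)
column_sums = [sum(math.exp(A_bar[i][j]) for i in range(2)) for j in range(2)]
A_bar_again = normalize(A_bar)               # normalising twice changes nothing
print(column_sums)
```

The column sums being equal to $1$ is exactly the condition $\mathscr{L}_{\bar{A}}(\mathbf{1}) = \mathbf{1}$ in this matrix model.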

We observe that the set $N(X_2) =: \mathcal{N}_2$ of normalized potentials depending on two
coordinates is defined by the equations

$$\begin{cases} e^{A_{11}} + e^{A_{21}} = 1 \\ e^{A_{12}} + e^{A_{22}} = 1 \end{cases}$$

so that $L(\mathcal{N}_2)$ is the set of 2 by 2 column stochastic matrices, denoted by $\mathcal{S}_2$.

To sum up, a normalized potential in $\mathcal{N}_2$ can be represented by the matrix of its values
on cylinders, subject to a nonlinear system of constraints, or as a column stochastic
matrix after coordinate-wise exponentiation. We thus obtain a natural chart
$S : [0,1] \times [0,1] \to \mathcal{S}_2$ by setting

$$S(x,y) = \begin{pmatrix} x & 1-y \\ 1-x & y \end{pmatrix}$$

where $x, y \in (0,1)$ can be thought of as transition probabilities $P[1 \to 1]$ and $P[2 \to 2]$,
respectively.

This parametrization has the advantage that $\mathcal{S}_2$ is (an open set of) an affine subspace
of $M_{2\times 2}(\mathbb{R})$: it has the same tangent space at each point, a basis of which is given by

$$\frac{\partial}{\partial x} = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}, \qquad \frac{\partial}{\partial y} = \begin{pmatrix} 0 & -1 \\ 0 & 1 \end{pmatrix}.$$

$^{8}$ See for example [Gan59] for the exact statement.

A tangent vector $\psi$ to $\mathcal{S}_2$ at $S(x,y)$ shall be written as $(\psi_1, \psi_2)$ in this basis, so that the
corresponding parametrized line $\gamma$ can be expressed for $s \in \mathbb{R}$ sufficiently small by

$$\gamma(s) = \begin{pmatrix} x + s\psi_1 & 1 - y - s\psi_2 \\ 1 - x - s\psi_1 & y + s\psi_2 \end{pmatrix}.$$


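The expression above is very readable, but it does not allow us to compute right
away the metric on $T_A\mathcal{N}_2$. However, given $A \in \mathcal{N}_2$ and $\zeta \in T_A\mathcal{N}_2$, by Item 2 of Theorem
5.1, we have that

$$\langle \zeta, \zeta \rangle_A = \int \zeta^2 \, d\mu_A.$$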

It will thus be convenient to work both in $\mathcal{N}_2$, where the functional interpretation of
matrices and vectors is clear, and in $\mathcal{S}_2$, where the Gibbs measures naturally appear.
If we consider a variation $\exp(A_{ij} + s\zeta_{ij})$ and differentiate at zero we obtain that the
system

$$\begin{cases} e^{A_{11}}\zeta_{11} + e^{A_{21}}\zeta_{21} = 0 \\ e^{A_{12}}\zeta_{12} + e^{A_{22}}\zeta_{22} = 0 \end{cases}$$

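defines $T_A\mathcal{N}_2 \subset X_2$ (in particular we see that this tangent plane depends on the point
$A$). If $L_A = S(x,y)$ and $\psi$ corresponds to $\zeta$ in $T_{L_A}\mathcal{S}_2$, i.e. $\psi = DL_A(\zeta)$, we get

$$\begin{pmatrix} \zeta_{11} & \zeta_{12} \\ \zeta_{21} & \zeta_{22} \end{pmatrix} = \begin{pmatrix} \dfrac{\psi_1}{x} & -\dfrac{\psi_2}{1-y} \\ -\dfrac{\psi_1}{1-x} & \dfrac{\psi_2}{y} \end{pmatrix}.$$

Now the matrix $S(x,y)$ has a right eigenvector

$$\pi = (\pi(1*), \pi(2*)) = \left( \frac{1-y}{2-x-y}, \; \frac{1-x}{2-x-y} \right),$$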


which is the invariant measure on $\{1,2\}$ of the Markov chain defined by $A$, and the
measures of cylinders with respect to $\mu_A$ are

$$\begin{aligned}
\mu(11*) &= P[1 \to 1]\,\pi(1) \\
\mu(12*) &= P[2 \to 1]\,\pi(2) \\
\mu(21*) &= P[1 \to 2]\,\pi(1) \\
\mu(22*) &= P[2 \to 2]\,\pi(2)
\end{aligned}$$
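As a sanity check on the eigenvector and the cylinder measures, one may evaluate them numerically. The Python sketch below is only an illustration; the helper names `chart`, `stationary` and `cylinder_measures` are hypothetical, and indices are shifted by one (state 1 is index 0).

```python
def chart(x, y):
    """Column stochastic matrix S(x, y); entry [i][j] is P[j+1 -> i+1]."""
    return [[x, 1.0 - y], [1.0 - x, y]]

def stationary(x, y):
    """Right eigenvector pi of S(x, y) for the eigenvalue 1, of total mass 1."""
    s = 2.0 - x - y
    return [(1.0 - y) / s, (1.0 - x) / s]

def cylinder_measures(x, y):
    """mu[i][j] is the measure of the cylinder (i+1)(j+1)*."""
    S = chart(x, y)
    pi = stationary(x, y)
    return [[S[i][j] * pi[j] for j in range(2)] for i in range(2)]
```

Summing `cylinder_measures(x, y)` over either index gives back `stationary(x, y)`, which reflects both the eigenvector equation and the shift-invariance of the measure on cylinders of length two.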


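It is now easy to compute the metric:

$$\begin{aligned}
\langle \zeta, \zeta \rangle_A
&= \frac{x(1-y)}{2-x-y}\,\frac{\psi_1^2}{x^2} + \frac{(1-y)(1-x)}{2-x-y}\,\frac{\psi_2^2}{(1-y)^2} \\
&\quad + \frac{\psi_1^2}{(1-x)^2}\,\frac{(1-x)(1-y)}{2-x-y} + \frac{\psi_2^2}{y^2}\,\frac{y(1-x)}{2-x-y} \\
&= \frac{1}{2-x-y}\left( \frac{1-y}{x(1-x)}\,\psi_1^2 + \frac{1-x}{y(1-y)}\,\psi_2^2 \right).
\end{aligned}$$

Proposition 8.1. The restriction $g$ of the variance metric $\langle \cdot, \cdot \rangle_A$ to $\mathcal{N}_2$ is given in the
chart $S$ by

$$g_A = \begin{pmatrix} \dfrac{1-y}{x(1-x)(2-x-y)} & 0 \\ 0 & \dfrac{1-x}{y(1-y)(2-x-y)} \end{pmatrix}$$

(which is positive-definite for all $(x,y) \in (0,1) \times (0,1)$).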


This means that for $L_A = S(x,y)$ and $\psi = DL_A(\zeta)$ we have $|\zeta|^2 = (\psi_1 \;\; \psi_2)\, g_A\, (\psi_1 \;\; \psi_2)^{T}$.
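The computation leading to Proposition 8.1 can be checked numerically. In the following Python sketch (an illustration under our own naming; `variance_norm_sq` and `metric_in_chart` are hypothetical helpers) the squared norm is first computed as the sum of $\zeta_{ij}^2\,\mu(ij*)$ over the four cylinders, with $\zeta_{ij} = \psi_{ij}/S_{ij}$, and then compared with the diagonal form in the chart.

```python
def variance_norm_sq(x, y, psi1, psi2):
    """Sum of zeta_ij^2 * mu(ij*) for the tangent vector (psi1, psi2) at S(x, y)."""
    s = 2.0 - x - y
    S = [[x, 1.0 - y], [1.0 - x, y]]
    pi = [(1.0 - y) / s, (1.0 - x) / s]
    # psi1 * d/dx + psi2 * d/dy written as a 2x2 matrix.
    psi = [[psi1, -psi2], [-psi1, psi2]]
    total = 0.0
    for i in range(2):
        for j in range(2):
            zeta = psi[i][j] / S[i][j]      # zeta = (DL_A)^{-1} psi, entrywise
            total += zeta * zeta * S[i][j] * pi[j]
    return total

def metric_in_chart(x, y):
    """Diagonal coefficients (E, G) of g_A as stated in Proposition 8.1."""
    s = 2.0 - x - y
    E = (1.0 - y) / (x * (1.0 - x) * s)
    G = (1.0 - x) / (y * (1.0 - y) * s)
    return E, G
```

For any interior point and any $(\psi_1, \psi_2)$, the value of `variance_norm_sq` should agree with `E * psi1**2 + G * psi2**2` up to rounding.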

Remark 8.2. Observe that, not incidentally, by recalling the proof of 5.2 we could have
computed $\langle \zeta, \zeta \rangle_A$ in a more roundabout way by using the equation contained there

$$\langle \zeta, \zeta \rangle_A = D^2(\log\Lambda)_A(\zeta,\zeta) = D(G^{\zeta})_A(\zeta).$$

As a side effect we easily obtain that

$$D^2(\log\Lambda)_A(\zeta,\zeta) = \frac{x(1-y)}{(1-x)(2-x-y)}\,\zeta_{11}^2 + \frac{(1-x)y}{(1-y)(2-x-y)}\,\zeta_{22}^2. \qquad (8)$$

We see for example that when $x$ or $y$ goes to 0, the pressure becomes very flat in the
corresponding direction (as opposed to very convex): the coefficient of $\zeta_{11}^2$, respectively
$\zeta_{22}^2$, in (8) goes to zero.
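Formula (8) lends itself to a finite-difference check. The sketch below is only an illustration with hypothetical function names: it takes a tangent vector $\zeta$ determined by $(\zeta_{11}, \zeta_{22})$ through the two linear constraints defining $T_A\mathcal{N}_2$, computes the logarithm of the Perron eigenvalue of the matrix with entries $e^{A_{ij} + s\zeta_{ij}}$ in closed form, and approximates its second derivative at $s = 0$ by a central difference.

```python
import math

def log_perron(M):
    """Logarithm of the Perron eigenvalue of a positive 2x2 matrix M."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return math.log((tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0)

def pressure_second_derivative(x, y, z11, z22, step=1e-3):
    """Central second difference of s -> log Lambda(A + s*zeta) at s = 0."""
    S = [[x, 1.0 - y], [1.0 - x, y]]
    # Tangency: each column of (S_ij * zeta_ij) sums to zero.
    zeta = [[z11, -y * z22 / (1.0 - y)],
            [-x * z11 / (1.0 - x), z22]]

    def pressure(s):
        return log_perron([[S[i][j] * math.exp(s * zeta[i][j])
                            for j in range(2)] for i in range(2)])

    return (pressure(step) - 2.0 * pressure(0.0) + pressure(-step)) / (step * step)

def formula_8(x, y, z11, z22):
    """Right-hand side of equation (8)."""
    s = 2.0 - x - y
    return (x * (1.0 - y) / ((1.0 - x) * s) * z11 ** 2
            + (1.0 - x) * y / ((1.0 - y) * s) * z22 ** 2)
```

The two functions should agree up to the discretization error of the difference quotient, which is of the order of `step**2`.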

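From the metric tensor, we compute the curvature at each point. For simplicity, if we
let $g_A = \begin{pmatrix} E & 0 \\ 0 & G \end{pmatrix}$ then we use the explicit formula for the curvature

$$K(A) = -\frac{1}{2\sqrt{EG}} \left( \frac{\partial}{\partial x}\frac{G_x}{\sqrt{EG}} + \frac{\partial}{\partial y}\frac{E_y}{\sqrt{EG}} \right)$$

where subscripts indicate partial derivatives with respect to the indicated variables. The
expression simplifies greatly (see Section 8.3):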

Corollary 8.3. The Gaussian curvature of $g$ at $A$ is given when $L_A = S(x,y)$ by

$$K(A) = \frac{1}{2-x-y}.$$



Remark 8.4. In the case at hand, the curvature is always strictly positive. In fact, it
is even bounded away from 0, so that $\mathcal{N}_2$ endowed with $g$ is not complete (indeed, if $g$
were complete then the Bonnet-Myers theorem would imply that $\mathcal{N}_2$ is compact).
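The value of the curvature can be cross-checked numerically without any symbolic simplification. The following Python sketch is an illustration, not the computation of Section 8.3: it evaluates the curvature formula for a diagonal metric by nested central differences, with $E$ and $G$ taken from Proposition 8.1. The names `E`, `G` and `gauss_curvature` and the step `d` are our own choices.

```python
import math

def E(x, y):
    return (1.0 - y) / (x * (1.0 - x) * (2.0 - x - y))

def G(x, y):
    return (1.0 - x) / (y * (1.0 - y) * (2.0 - x - y))

def gauss_curvature(x, y, d=1e-4):
    """K = -1/(2 sqrt(EG)) * ( d/dx (G_x/sqrt(EG)) + d/dy (E_y/sqrt(EG)) )."""
    def root(u, v):
        return math.sqrt(E(u, v) * G(u, v))

    def gx_over_root(u, v):
        return (G(u + d, v) - G(u - d, v)) / (2.0 * d) / root(u, v)

    def ey_over_root(u, v):
        return (E(u, v + d) - E(u, v - d)) / (2.0 * d) / root(u, v)

    dx_term = (gx_over_root(x + d, y) - gx_over_root(x - d, y)) / (2.0 * d)
    dy_term = (ey_over_root(x, y + d) - ey_over_root(x, y - d)) / (2.0 * d)
    return -(dx_term + dy_term) / (2.0 * root(x, y))
```

At interior points the result should be close to $1/(2-x-y)$, which is larger than $1/2$ on the whole open square.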


8.2 Rescaling the metric 

We considered in the previous reasoning the Riemannian norm $\langle \zeta, \zeta \rangle_A$ of a tangent
vector $\zeta$ at the potential $A$ given by the asymptotic variance, as in Theorem 5.1. We
wonder how rescaling the metric by the entropy would affect the curvature, following
previous work by McMullen [McM08]. Given the eigenvector $\pi$ of the previous section
(which corresponds to the eigenmeasure) the entropy is given as a function of $x, y$ by

$$h(x,y) = -\frac{1-y}{2-x-y}\big( x\log(x) + (1-x)\log(1-x) \big) - \frac{1-x}{2-x-y}\big( (1-y)\log(1-y) + y\log(y) \big).$$

This function is always positive on $(0,1) \times (0,1)$ and is 0 in the limit to the vertex
$(0,0)$ and the edges $\{1\} \times [0,1]$ and $[0,1] \times \{1\}$ (Figure 2). Note that there is a strong
asymmetry between the cases $x = 0$ and $x = 1$ (similarly for $y$), as $x = 1$ means the
Markov chain gets stuck at the state 1, while $x = 0$ means the random walk is always
repelled away from state 1, but then can either stay at 2 or come back to 1, leaving
enough uncertainty to yield positive entropy.
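The asymmetry between $x = 0$ and $x = 1$ is easy to observe numerically. The short Python sketch below (the function name `entropy` is our own) evaluates $h$ at interior points close to the boundary; it is meant as an illustration only.

```python
import math

def entropy(x, y):
    """Entropy h(x, y) of the two-state Markov chain, for x, y in (0, 1)."""
    s = 2.0 - x - y
    hx = -(x * math.log(x) + (1.0 - x) * math.log(1.0 - x))
    hy = -(y * math.log(y) + (1.0 - y) * math.log(1.0 - y))
    # Weights are the stationary probabilities pi(1) and pi(2).
    return (1.0 - y) / s * hx + (1.0 - x) / s * hy
```

For instance, with $y = 1/2$ fixed, `entropy(x, 0.5)` tends to 0 as $x$ approaches 1, whereas it tends to $\tfrac{2}{3}\log 2$ as $x$ approaches 0, since the stationary vector then tends to $(1/3, 2/3)$ and only state 2 contributes.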


Figure 2: The entropy in $(x,y)$ coordinates.


We rescale the metric associated to the matrix $g_A$ of the previous section to a new $\tilde{g}_A$
in the interior of the square by setting $\tilde{g}_A = \begin{pmatrix} E/h & 0 \\ 0 & G/h \end{pmatrix}$ where $h$ is the entropy functional.

We denote by $K$ the curvature associated to the metric $g$ and by $\tilde{K}$ the one associated to $\tilde{g}$.

After a little bit of juggling with the equations, for the strictly positive function $h(x,y)$
one gets

$$\frac{\tilde{K}}{h} = K + \frac{1}{2\sqrt{EG}} \left( \left( \frac{\sqrt{E}\,h_y}{\sqrt{G}\,h} \right)_y + \left( \frac{\sqrt{G}\,h_x}{\sqrt{E}\,h} \right)_x \right).$$

The explicit expression of $\tilde{K}$ is particularly long and tedious to handle. We use the
software Maxima both to do the necessary symbolic manipulation and to plot the
graph (Figure 3). In the case of some subshifts related to Fuchsian groups, McMullen
showed that this precise scaling of the metric identifies with the Weil-Petersson metric
on Teichmüller space, which is known to be of negative Ricci curvature. One could thus
expect that $\tilde{g}$ has negative curvature, but this turns out not to be the case: $\tilde{K}$ takes
both positive and negative values.
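For readers without Maxima at hand, the rescaled curvature can also be explored numerically. The Python sketch below is an illustration under our own naming and is not the symbolic computation used for Figure 3. It evaluates the curvature of a diagonal metric by nested central differences, applies it to $(E/h, G/h)$, and compares the result with $h$ times the sum of $K$ and the usual conformal correction term.

```python
import math

def E(x, y):
    return (1.0 - y) / (x * (1.0 - x) * (2.0 - x - y))

def G(x, y):
    return (1.0 - x) / (y * (1.0 - y) * (2.0 - x - y))

def h(x, y):
    s = 2.0 - x - y
    hx = -(x * math.log(x) + (1.0 - x) * math.log(1.0 - x))
    hy = -(y * math.log(y) + (1.0 - y) * math.log(1.0 - y))
    return ((1.0 - y) * hx + (1.0 - x) * hy) / s

def ddx(f, d=1e-4):
    """Central-difference approximation of the x-derivative of f."""
    return lambda x, y: (f(x + d, y) - f(x - d, y)) / (2.0 * d)

def ddy(f, d=1e-4):
    """Central-difference approximation of the y-derivative of f."""
    return lambda x, y: (f(x, y + d) - f(x, y - d)) / (2.0 * d)

def curvature(E_, G_):
    """Gaussian curvature of the diagonal metric E_ dx^2 + G_ dy^2."""
    root = lambda x, y: math.sqrt(E_(x, y) * G_(x, y))
    a = ddx(lambda x, y: ddx(G_)(x, y) / root(x, y))
    b = ddy(lambda x, y: ddy(E_)(x, y) / root(x, y))
    return lambda x, y: -(a(x, y) + b(x, y)) / (2.0 * root(x, y))

K = curvature(E, G)
K_tilde = curvature(lambda x, y: E(x, y) / h(x, y),
                    lambda x, y: G(x, y) / h(x, y))

def K_tilde_conformal(x, y):
    """h * (K + correction), with the correction built from h_x/h and h_y/h."""
    a = ddx(lambda u, v: math.sqrt(G(u, v) / E(u, v)) * ddx(h)(u, v) / h(u, v))
    b = ddy(lambda u, v: math.sqrt(E(u, v) / G(u, v)) * ddy(h)(u, v) / h(u, v))
    root = math.sqrt(E(x, y) * G(x, y))
    return h(x, y) * (K(x, y) + (a(x, y) + b(x, y)) / (2.0 * root))
```

Evaluating `K_tilde` on a grid of interior points is one way to look for the sign changes reported above; near the boundary of the square the finite differences lose accuracy and a smaller step or a symbolic computation is preferable.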



Figure 3: Curvature of the variance metric with McMullen’s normalization.


8.3 Intermediate steps 


Following Section 4 of [dC76], the explicit computation of the curvature goes through the
following steps. Observe that